RELATED APPLICATIONS

Benefit of priority to U.S. provisional application Serial No.
60/294,758, filed May 30, 2001, to Perkins et al., entitled
"CHROMOSOME-BASED PLATFORMS" and to U.S. provisional application
Serial No. 60/366,891, filed March 21, 2002, to Perkins et al., entitled
"CHROMOSOME-BASED PLATFORMS" is claimed. Where permitted, the
subject matter of each of these applications is herein incorporated by
reference in its entirety.

This application is related to Provisional Application No.
60/294,687, filed May 30, 2001, by CARL PEREZ AND STEVEN
FABIJANSKI entitled PLANT ARTIFICIAL CHROMOSOMES, USES
THEREOF AND METHODS FOR PREPARING PLANT ARTIFICIAL
CHROMOSOMES and to U.S. Provisional Application No. 60/296,329,
filed June 4, 2001, by CARL PEREZ AND STEVEN FABIJANSKI entitled
PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS
FOR PREPARING PLANT ARTIFICIAL CHROMOSOMES. This application
also is related to U.S. Provisional Application No. 60/294,758, filed
May 30, 2001, by EDWARD PERKINS et al., entitled CHROMOSOME-BASED
PLATFORMS and to U.S. Provisional Application No. 60/366,891, filed
March 21, 2002, by EDWARD PERKINS et al., entitled
CHROMOSOME-BASED PLATFORMS. This application is also related to
U.S. application Serial Nos. (attorney dkt nos. 24601-419 and 419PC),
filed on the same day herewith, entitled PLANT ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS OF PREPARING
PLANT ARTIFICIAL CHROMOSOMES to Perez et al.

This application is related to U.S. application Serial No.
08/695,191, filed August 7, 1996 by GYULA HADLACZKY and ALADAR
SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND
METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES, now U.S.
Patent No. 6,025,155. This application is also related to U.S. application
Serial No. 08/682,080, filed July 15, 1996 by GYULA HADLACZKY and
ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF
AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES, now
U.S. Patent No. 6,077,697. This application is also related to U.S.
application Serial No. 08/629,822, filed April 10, 1996 by GYULA
HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING
ARTIFICIAL CHROMOSOMES (now abandoned), and is also related to
copending U.S. application Serial No. 09/096,648, filed June 12, 1998,
by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING
ARTIFICIAL CHROMOSOMES and to U.S. application Serial No.
09/835,682, filed April 10, 1997 by GYULA HADLACZKY and ALADAR
SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND
METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES (now
abandoned). This application is also related to copending U.S. application
Serial No. 09/724,726, filed November 28, 2000, U.S. application Serial
No. 09/724,872, filed November 28, 2000, U.S. application Serial No.
09/724,693, filed November 28, 2000, U.S. application Serial No.
09/799,462, filed March 5, 2001, U.S. application Serial No.
09/836,911, filed April 17, 2001, and U.S. application Serial No.
10/125,767, filed April 17, 2002, each of which is by GYULA
HADLACZKY and ALADAR SZALAY, and is entitled ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING
ARTIFICIAL CHROMOSOMES. This application is also related to
International PCT application No. WO 97/40183. Where permitted, the
subject matter of each of these provisional applications, international
applications, and applications is incorporated by reference in its entirety.
FIELD OF INVENTION 

Artificial chromosomes, including ACes, that have been engineered
to contain available sites for site-specific, recombination-directed
integration of DNA of interest are provided. These artificial chromosomes
permit tractable, efficient, rational engineering of the chromosome.
BACKGROUND 

Artificial chromosomes 
A variety of artificial chromosomes for use in plants and animals,
particularly higher plants and animals, are available. In particular, U.S.
Patent Nos. 6,025,155 and 6,077,697 provide heterochromatic artificial
chromosomes designated therein as satellite artificial chromosomes
(SATACs) and now designated artificial chromosome expression systems
(ACes). These chromosomes are prepared by introducing heterologous
DNA into a selected plant or animal cell under conditions that result in
integration into a region of the chromosome that leads to an amplification
event resulting in production of a dicentric chromosome. Subsequent
treatment and growth of cells with dicentric chromosomes, including
further amplifications, ultimately results in the artificial chromosomes
provided therein. In order to introduce a desired heterologous gene (or a
plurality of heterologous genes) into the artificial chromosome, the
process is repeated, introducing the desired heterologous genes and
nucleic acids in the initial targeting step. This process is time-consuming
and tedious. Hence, more tractable and efficient methods for introducing
heterologous nucleic acid molecules into artificial chromosomes,
particularly ACes, are needed.

Therefore, it is an object herein to provide engineered artificial
chromosomes that permit tractable, efficient and rational engineering of
artificial chromosomes.
SUMMARY OF THE INVENTION 

Provided herein are artificial chromosomes that permit tractable,
efficient and rational engineering thereof. In particular, the artificial
chromosomes provided herein contain one or a plurality of loci (sites) for
site-specific, recombination-directed integration of DNA. Thus, provided
herein are platform artificial chromosome expression systems ("platform
ACes") containing single or multiple site-specific recombination sites.
The artificial chromosomes and ACes artificial chromosomes include plant
and animal chromosomes. Any recombinase system that effects
site-specific recombination is contemplated for use herein.

In one embodiment, chromosomes, including platform ACes, are 
provided that contain one or more lambda att sites designed for 
recombination-directed integration in the presence of lambda integrase, 
and that are mutated so that they do not require additional factors. 
Methods for preparing such chromosomes, vectors for use in the 
methods, and uses of the resulting chromosomes are also provided. 

Platform ACes containing the recombination site(s), methods for
introducing heterologous nucleic acid into such sites, and vectors
therefor are provided.

Also provided herein is a bacteriophage lambda (λ) integrase
site-specific recombination system.

Methods using recombinase-mediated recombination, target gene
expression vectors and/or genes for insertion thereof into platform
chromosomes, and the resulting chromosomes are provided.

Combinations and kits containing the combinations of vectors
encoding a recombinase and integrase and primers for introduction of the
site recognized thereby are also provided. The kits optionally include
instructions for performing site-directed integration or preparation of ACes
containing such sites.

Also provided herein are mammalian and plant cells comprising the 
artificial chromosomes and ACes described herein. The cells can be 
nuclear donor cells, stem cells, such as a mesenchymal stem cell, a 
hematopoietic stem cell, an adult stem cell or an embryonic stem cell. 

Also provided is a lambda-intR mutein comprising a glutamic acid to
arginine change at position 174 of wild-type lambda integrase. Also
provided are transgenic animals and methods for producing a transgenic
animal, comprising introducing an ACes into an embryonic cell, such as a
stem cell or embryo. The ACes can comprise heterologous nucleic acid
that encodes a therapeutic product. The transgenic animal can be a fish,
insect, reptile, amphibian, arachnid or mammal. In certain embodiments,
the ACes is introduced by cell fusion, lipid-mediated transfection by a
carrier system, microinjection, microcell fusion, electroporation,
microprojectile bombardment or direct DNA transfer.

The platform ACes, including plant and animal ACes, such as 
MACs, provided herein can be introduced into cells, such as, but not 
limited to, animal cells, including mammalian cells, and into plant cells. 
Hence plant cells that contain platform MACs, animal cells that contain 
platform PACs and other combinations of cells and platform ACes are 
provided. 

DESCRIPTION OF FIGURES 

FIGURE 1 provides a diagram depicting creation of an exemplary 
ACes artificial chromosome prepared using methods detailed in U.S. 
Patent Nos. 6,025,155 and 6,077,697 and International PCT application 
No. WO 97/40183. In this exemplified embodiment, the nucleic acid is 
targeted to an acrocentric chromosome in an animal or plant, and the



WO 02/097059 



PCT/US02/17452



heterologous nucleic acid includes a sequence-specific recombination site 
and marker genes. 

FIGURE 2 provides a map of pWEPuro9K, which is a targeting
vector derived from the vector pWE15 (GenBank Accession # X65279;
SEQ ID No. 31). Plasmid pWE15 was modified by replacing the SalI
(Klenow filled)/SmaI neomycin resistance-encoding fragment with the
PvuII/BamHI (Klenow filled) puromycin resistance-encoding fragment
(isolated from plasmid pPUR, Clontech Laboratories, Inc., Palo Alto, CA;
GenBank Accession no. U07648; SEQ ID No. 30) resulting in plasmid
pWEPuro. Subsequently a 9 Kb NotI fragment from the plasmid pFK161
(see Example 1; see, also, Csonka et al. (2000) Journal of Cell Science
113:3207-3216; and SEQ ID NO: 118), containing a portion of the
mouse rDNA region, was cloned into the NotI site of pWEPuro resulting in
plasmid pWEPuro9K.

FIGURE 3 depicts construction of an ACes platform chromosome
with a single recombination site, such as a loxP site or an attP or attB site.
This platform ACes chromosome is an exemplary artificial chromosome
with a single recombination site.

FIGURE 4 provides a map of plasmid pSV40-193attPsensePur.

FIGURE 5 depicts a method for formation of a chromosome
platform with multiple recombination integration sites, such as attP sites.

FIGURE 6 sets forth the sequences of the core region of attP, attB,
attL and attR (SEQ ID Nos. 33-36).

FIGURE 7 depicts insertional recombination of a vector encoding a
marker gene, DsRed and an attB site with an artificial chromosome
containing an attP site.

FIGURE 8 provides a map of plasmid pCXLamIntR (SEQ ID NO:
112), which includes the Lambda integrase (E174R)-encoding nucleic
acid.

FIGURE 9 diagrammatically summarizes the platform technology;
marker 1 permits selection of the artificial chromosomes containing the
integration site; marker 2, which is promoterless in the target gene
expression vector, permits selection of recombinants. Upon
recombination with the platform, marker 2 is expressed under the control
of a promoter resident on the platform.

FIGURE 10 provides the vector map for the plasmid
p18attBZEO-5'6XHS4eGFP (SEQ ID NO: 116).

FIGURE 11 provides the vector map for the plasmid
p18attBZEO-3'6XHS4eGFP (SEQ ID NO: 115).

FIGURE 12 provides the vector map for the plasmid
p18attBZEO-(6XHS4)2eGFP (SEQ ID NO: 110).

FIGURES 13 AND 14 depict the integration of a PCR product by
site-specific recombination as set forth in Example 8.

FIGURE 15 provides the vector map for the plasmid pPACrDNA as
set forth in Example 9. A.

DETAILED DESCRIPTION OF THE INVENTION 
A. DEFINITIONS 

Unless defined otherwise, all technical and scientific terms used
herein have the same meaning as is commonly understood by one of skill
in the art to which the invention(s) belong. All patents, patent
applications, published applications and publications, GenBank sequences,
websites and other published materials referred to throughout the entire
disclosure herein, unless noted otherwise, are incorporated by reference
in their entirety. Where reference is made to a URL or other such
identifier or address, it is understood that such identifiers can change and
particular information on the internet can come and go, but equivalent
information can be found by searching the internet. Reference thereto
evidences the availability and public dissemination of such information.

As used herein, nucleic acid refers to single-stranded and/or
double-stranded polynucleotides, such as deoxyribonucleic acid (DNA)
and ribonucleic acid (RNA), as well as analogs or derivatives of either
RNA or DNA. Also included in the term "nucleic acid" are analogs of
nucleic acids such as peptide nucleic acid (PNA), phosphorothioate DNA,
and other such analogs and derivatives. When referring to probes or
primers, optionally labeled with a detectable label, such as a fluorescent
or radiolabel, single-stranded molecules are contemplated. Such
molecules are typically of a length such that they are statistically unique
and of low copy number (typically less than 5, preferably less than 3) for
probing or priming a library. Generally a probe or primer contains at least
14, 16 or 30 contiguous nucleotides of sequence complementary to or
identical to a gene of interest. Probes and primers can be 10, 20, 30, 50,
100 or more nucleotides long.
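By way of illustration only, the notion of a probe or primer that is "statistically unique" for a given library can be approximated with a simple calculation. The following Python sketch assumes an idealized library of random sequence with equal base frequencies, which real libraries are not; the function names and the example library sizes are hypothetical and are not part of the disclosure.

```python
def expected_occurrences(probe_length: int, library_size_bp: float) -> float:
    """Expected number of chance matches of one specific probe sequence.

    Assumption (idealization): the library is random sequence with equal
    A/C/G/T frequencies; both strands of double-stranded DNA are counted.
    """
    return 2 * library_size_bp / 4 ** probe_length


def min_unique_length(library_size_bp: float, max_expected: float = 1.0) -> int:
    """Shortest probe length whose expected chance-match count is below max_expected."""
    n = 1
    while expected_occurrences(n, library_size_bp) >= max_expected:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical library sizes, in base pairs, chosen only for illustration.
    for size in (5e6, 1e8, 3e9):
        print(f"{size:.0e} bp library -> probe of at least {min_unique_length(size)} nt")
```

Under these assumptions, a 14-nucleotide probe is expected to match fewer than once by chance in a library of 10^8 base pairs, whereas a library of 3 x 10^9 base pairs calls for a longer probe.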

As used herein, DNA is meant to include all types and sizes of DNA
molecules including cDNA, plasmids and DNA including modified
nucleotides and nucleotide analogs.

As used herein, nucleotides include nucleoside mono-, di-, and
triphosphates. Nucleotides also include modified nucleotides, such as,
but not limited to, phosphorothioate nucleotides and deazapurine
nucleotides and other nucleotide analogs.

As used herein, heterologous or foreign DNA and RNA are used
interchangeably and refer to DNA or RNA that does not occur naturally as
part of the genome in which it is present or which is found in a location
or locations and/or in amounts in a genome or cell that differ from that in
which it occurs in nature. Heterologous nucleic acid is generally not
endogenous to the cell into which it is introduced, but has been obtained
from another cell or prepared synthetically. Generally, although not
necessarily, such nucleic acid encodes RNA and proteins that are not
normally produced by the cell in which it is expressed. Any DNA or RNA
that one of skill in the art would recognize or consider as heterologous or
foreign to the cell in which it is expressed is herein encompassed by
heterologous DNA. Heterologous DNA and RNA may also encode RNA or
proteins that mediate or alter expression of endogenous DNA by affecting
transcription, translation, or other regulatable biochemical processes.

Examples of heterologous DNA include, but are not limited to, DNA
that encodes a gene product or gene product(s) of interest, introduced for
purposes of modification of the endogenous genes or for production of an
encoded protein. For example, a heterologous or foreign gene may be
isolated from a different species than that of the host genome, or
alternatively, may be isolated from the host genome but operably linked
to one or more regulatory regions which differ from those found in the
unaltered, native gene. Other examples of heterologous DNA include, but
are not limited to, DNA that encodes traceable marker proteins, such as a
protein that confers traits including, but not limited to, herbicide, insect,
or disease resistance, or traits including, but not limited to, oil quality or
carbohydrate composition. Antibodies that are encoded by heterologous
DNA may be secreted or expressed on the surface of the cell in which the
heterologous DNA has been introduced.

As used herein, operative linkage or operative association, or
grammatical variations thereof, of heterologous DNA to regulatory and
effector sequences of nucleotides, such as promoters, enhancers,
transcriptional and translational stop sites, and other signal sequences
refers to the relationship between such DNA and such sequences of
nucleotides. For example, operative linkage of heterologous DNA to a
promoter refers to the physical relationship between the DNA and the
promoter such that the transcription of such DNA is initiated from the
promoter by an RNA polymerase that specifically recognizes, binds to and
transcribes the DNA.

In order to optimize expression and/or in vitro transcription, it may
be necessary to remove, add or alter 5' untranslated portions of the
clones to eliminate extra, potentially inappropriate alternative translation
initiation (i.e., start) codons or other sequences that may interfere with or
reduce expression, either at the level of transcription or translation.
Alternatively, consensus ribosome binding sites (see, e.g., Kozak (1991)
J. Biol. Chem. 266:19867-19870) can be inserted immediately 5' of the
start codon and may enhance expression.

As used herein, a sequence complementary to at least a portion of
an RNA, with reference to antisense oligonucleotides, means a sequence
having sufficient complementarity to be able to hybridize with the RNA,
preferably under moderate or high stringency conditions, forming a stable
duplex. The ability to hybridize depends on the degree of
complementarity and the length of the antisense nucleic acid. The longer
the hybridizing nucleic acid, the more base mismatches it can contain and
still form a stable duplex (or triplex, as the case may be). One skilled in
the art can ascertain a tolerable degree of mismatch by use of standard
procedures to determine the melting point of the hybridized complex.
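As one illustration of a rough melting-point estimate, the following Python sketch applies the Wallace rule (2 degrees C per A or T, 4 degrees C per G or C), an approximation generally used only for short, perfectly matched oligonucleotides. It is not a procedure set forth herein; the function name is hypothetical, and mismatches, salt concentration and RNA-DNA duplex effects are not modeled.

```python
def wallace_tm(oligo: str) -> int:
    """Approximate melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (number of A + T) + 4 * (number of G + C).
    Intended only as a first estimate for short, perfectly matched oligos.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G and T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


if __name__ == "__main__":
    # Example oligo chosen only for illustration (13 nt, 3 G/C and 10 A/T).
    print(wallace_tm("ATAACTTCGTATA"))
```

Because each added base raises the estimate, a longer hybridizing sequence starts from a higher melting point and so can lose more stability to mismatches before the duplex fails at a given temperature, which is consistent with the length dependence stated above.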
As used herein, regulatory molecule refers to a polymer of 
deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) or a polypeptide 
that is capable of enhancing or inhibiting expression of a gene. 

As used herein, recognition sequences are particular sequences of
nucleotides that a protein, DNA, or RNA molecule, or combinations
thereof, (such as, but not limited to, a restriction endonuclease, a
modification methylase and a recombinase) recognizes and binds. For
example, a recognition sequence for Cre recombinase (see, e.g., SEQ ID
NO:58) is a 34 base pair sequence containing two 13 base pair inverted
repeats (serving as the recombinase binding sites) flanking an 8 base pair
core and designated loxP (see, e.g., Sauer (1994) Current Opinion in
Biotechnology 5:521-527). Other examples of recognition sequences
include, but are not limited to, attB and attP, attR and attL and others
(see, e.g., SEQ ID Nos. 8, 41-56 and 72), that are recognized by the
recombinase enzyme Integrase (see, SEQ ID Nos. 37 and 38 for the
nucleotide and encoded amino acid sequences of an exemplary lambda
phage integrase).

The recombination site designated attB is an approximately 33 base
pair sequence containing two 9 base pair core-type Int binding sites and a
7 base pair overlap region; attP (SEQ ID No. 72) is an approximately 240
base pair sequence containing core-type Int binding sites and arm-type Int
binding sites as well as sites for auxiliary proteins IHF, FIS, and Xis (see,
e.g., Landy (1993) Current Opinion in Biotechnology 3:699-707; see,
e.g., SEQ ID Nos. 8 and 72).

As used herein, a recombinase is an enzyme that catalyzes the
exchange of DNA segments at specific recombination sites. An integrase
herein refers to a recombinase that is a member of the lambda (λ)
integrase family.

As used herein, recombination proteins include excisive proteins,
integrative proteins, enzymes, co-factors and associated proteins that are
involved in recombination reactions using one or more recombination sites
(see, Landy (1993) Current Opinion in Biotechnology 3:699-707). The
recombination proteins used herein can be delivered to a cell via an
expression cassette on an appropriate vector, such as a plasmid, and the
like. In other embodiments, the recombination proteins can be delivered
to a cell in protein form in the same reaction mixture used to deliver the
desired nucleic acid, such as a platform ACes, donor target vectors, and
the like.

As used herein, the expression "lox site" means a sequence of
nucleotides at which the gene product of the cre gene, referred to
herein as Cre, can catalyze a site-specific recombination event. A LoxP
site is a 34 base pair nucleotide sequence from bacteriophage P1 (see,
e.g., Hoess et al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398-3402).
The LoxP site contains two 13 base pair inverted repeats separated by an
8 base pair spacer region as follows (SEQ ID NO. 57):

ATAACTTCGTATA ATGTATGC TATACGAAGTTAT 
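
The structure just described can be checked directly against the sequence shown. The following Python sketch is illustrative only; the helper names are hypothetical. It splits the 34 base pair site into its 13, 8 and 13 base pair parts and confirms that the two outer elements are inverted repeats (reverse complements of one another), while the spacer is not its own reverse complement, which is what gives the site a direction.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# LoxP site as shown above, written as left repeat + spacer + right repeat.
LOXP = "ATAACTTCGTATA" "ATGTATGC" "TATACGAAGTTAT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def split_loxp(site: str) -> tuple[str, str, str]:
    """Split a 34 bp lox site into (left repeat, spacer, right repeat)."""
    if len(site) != 34:
        raise ValueError("a loxP site is 34 base pairs long")
    return site[:13], site[13:21], site[21:]


left, spacer, right = split_loxp(LOXP)

if __name__ == "__main__":
    print("inverted repeats:", reverse_complement(left) == right)
    print("spacer is asymmetric:", reverse_complement(spacer) != spacer)
```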

E. coli DH5 Δlac and yeast strain BSY23 transformed with plasmid pBS44
carrying two loxP sites connected with a LEU2 gene are available from
the American Type Culture Collection (ATCC) under accession numbers
ATCC 53254 and ATCC 20773, respectively. The lox sites can be
isolated from plasmid pBS44 with restriction enzymes EcoRI and SalI, or
XhoI and BamHI. In addition, a preselected DNA segment can be inserted
into pBS44 at either the SalI or BamHI restriction enzyme sites. Other lox
sites include, but are not limited to, LoxB, LoxL, LoxC2 and LoxR sites,
which are nucleotide sequences isolated from E. coli (see, e.g., Hoess et
al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398). Lox sites can also be
produced by a variety of synthetic techniques (see, e.g., Ito et al. (1982)
Nuc. Acid Res. 10:1755 and Ogilvie et al. (1981) Science 214:270).

As used herein, the expression "cre gene" means a sequence of
nucleotides that encodes a gene product that effects site-specific
recombination of DNA in eukaryotic cells at lox sites. One cre gene can
be isolated from bacteriophage P1 (see, e.g., Abremski et al. (1983) Cell
32:1301-1311). E. coli DH1 and yeast strain BSY90 transformed with
plasmid pBS39 carrying a cre gene isolated from bacteriophage P1 and a
GAL1 regulatory nucleotide sequence are available from the American
Type Culture Collection (ATCC) under accession numbers ATCC 53255
and ATCC 20772, respectively. The cre gene can be isolated from
plasmid pBS39 with restriction enzymes XhoI and SalI.

As used herein, site-specific recombination refers to site-specific
recombination that is effected between two specific sites on a single
nucleic acid molecule or between two different molecules that requires
the presence of an exogenous protein, such as an integrase or
recombinase.

For example, Cre-lox site-specific recombination can include the
following three events:

a. deletion of a pre-selected DNA segment flanked by lox
sites;

b. inversion of the nucleotide sequence of a pre-selected
DNA segment flanked by lox sites; and

c. reciprocal exchange of DNA segments proximate to
lox sites located on different DNA molecules.

This reciprocal exchange of DNA segments can result in an
integration event if one or both of the DNA molecules are circular. DNA
segment refers to a linear fragment of single- or double-stranded
deoxyribonucleic acid (DNA), which can be derived from any source.
Since the lox site is an asymmetrical nucleotide sequence, two lox sites
on the same DNA molecule can have the same or opposite orientations
with respect to each other. Recombination between lox sites in the same
orientation results in a deletion of the DNA segment located between the
two lox sites and a connection between the resulting ends of the original
DNA molecule. The deleted DNA segment forms a circular molecule of
DNA. The original DNA molecule and the resulting circular molecule each
contain a single lox site. Recombination between lox sites in opposite
orientations on the same DNA molecule results in an inversion of the
nucleotide sequence of the DNA segment located between the two lox
sites. In addition, reciprocal exchange of DNA segments proximate to lox
sites located on two different DNA molecules can occur. All of these
recombination events are catalyzed by the gene product of the cre gene.
Thus, the Cre-lox system can be used to specifically delete, invert, or
insert DNA. The precise event is controlled by the orientation of lox DNA
sequences. In cis, the lox sequences direct the Cre recombinase to either
delete (lox sequences in direct orientation) or invert (lox sequences in
inverted orientation) DNA flanked by the sequences, while in trans the lox
sequences can direct a homologous recombination event resulting in the
insertion of a recombinant DNA.
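
The dependence of the outcome on lox site orientation can be modeled at the sequence level. The following Python sketch is a simplified illustration under stated assumptions: sequences are plain strings, only exact loxP matches are recognized, a circular molecule is written as a string that starts at its lox site, and the strand-exchange chemistry itself is not modeled. All function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # 34 bp site, forward orientation


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


LOXP_RC = reverse_complement(LOXP)  # the same site in the opposite orientation


def find_sites(seq: str) -> list[tuple[int, int]]:
    """Return sorted (position, orientation) pairs; +1 forward, -1 reverse."""
    hits = []
    for orientation, pattern in ((+1, LOXP), (-1, LOXP_RC)):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, orientation))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def recombine_in_cis(seq: str) -> dict:
    """Model recombination between exactly two lox sites on one linear molecule."""
    sites = find_sites(seq)
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    (i, first), (j, second) = sites
    n = len(LOXP)
    if j < i + n:
        raise ValueError("lox sites overlap")
    if first == second:
        # Same orientation: the flanked segment leaves as a circle carrying
        # one lox site; one lox site remains on the original molecule.
        return {"event": "deletion",
                "product": seq[:i] + seq[j:],
                "excised_circle": seq[i:j]}
    # Opposite orientation: the flanked segment is inverted in place.
    return {"event": "inversion",
            "product": seq[:i + n] + reverse_complement(seq[i + n:j]) + seq[j:],
            "excised_circle": None}


def integrate_in_trans(target: str, circle: str) -> str:
    """Model insertion of a circular molecule (written starting at its lox
    site) into a linear target carrying one forward-oriented lox site."""
    sites = find_sites(target)
    if len(sites) != 1 or sites[0][1] != +1 or not circle.startswith(LOXP):
        raise ValueError("expected one forward lox site on each molecule")
    i = sites[0][0]
    return target[:i] + circle + target[i:]
```

In this model, insertion in trans is the exact reverse of deletion in cis: integrating the excised circle back into the deletion product regenerates the starting molecule with its two directly oriented lox sites.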

As used herein, a chromosome is a nucleic acid molecule, and
associated proteins, that is capable of replication and segregation within a
cell upon cell division. Typically, a chromosome contains a centromeric
region, replication origins, telomeric regions and a region of nucleic acid
between the centromeric and telomeric regions.

As used herein, a centromere is any nucleic acid sequence that
confers an ability to segregate to daughter cells through cell division. A
centromere may confer stable segregation of a nucleic acid sequence,
including an artificial chromosome containing the centromere, through
mitotic or meiotic divisions, including through both mitotic and meiotic
divisions. A particular centromere is not necessarily derived from the
same species in which it is introduced, but has the ability to promote
DNA segregation in cells of that species.

As used herein, euchromatin and heterochromatin have their
recognized meanings. Euchromatin refers to chromatin that stains
diffusely and that typically contains genes, and heterochromatin refers to
chromatin that remains unusually condensed and that has been thought to
be transcriptionally inactive. Highly repetitive DNA sequences (satellite
DNA) are usually located in regions of the heterochromatin surrounding
the centromere (pericentric or pericentromeric heterochromatin).
Constitutive heterochromatin refers to heterochromatin that contains the
highly repetitive DNA which is constitutively condensed and genetically
inactive.

As used herein, an acrocentric chromosome refers to a
chromosome with arms of unequal length.

As used herein, endogenous chromosomes refer to genomic
chromosomes as found in a cell prior to generation or introduction of an
artificial chromosome.

As used herein, artificial chromosomes are nucleic acid molecules,
typically DNA, that stably replicate and segregate alongside endogenous
chromosomes in cells and have the capacity to accommodate and express
heterologous genes contained therein. An artificial chromosome has the
capacity to act as a gene delivery vehicle by accommodating and
expressing foreign genes contained therein. A mammalian artificial
chromosome (MAC) refers to chromosomes that have an active mammalian
centromere(s). Plant artificial chromosomes, insect artificial chromosomes
and avian artificial chromosomes refer to chromosomes that include
centromeres that function in plant, insect and avian cells, respectively.
A human artificial chromosome (HAC) refers to chromosomes that include
centromeres that function in human cells. For exemplary artificial
chromosomes, see, e.g., U.S. Patent Nos. 6,025,155; 6,077,697;
5,288,625; 5,712,134; 5,695,967; 5,869,294; 5,891,691 and 5,721,118
and published International PCT application Nos. WO 97/40183 and
WO 98/08964.

Artificial chromosomes include those that are predominantly
heterochromatic (formerly referred to as satellite artificial chromosomes
(SATACs); see, e.g., U.S. Patent Nos. 6,077,697 and 6,025,155 and
published International PCT application No. WO 97/40183),
minichromosomes that contain a de novo centromere (see, U.S. Patent
Nos. 5,712,134, 5,891,691 and 5,288,625), artificial chromosomes
predominantly made up of repeating nucleic acid units and that contain
substantially equivalent amounts of euchromatic and heterochromatic
DNA and in vitro assembled artificial chromosomes (see, copending U.S.
provisional application Serial No. 60/294,687, filed on May 30, 2001).

As used herein, the term "satellite DNA-based artificial
chromosome (SATAC)" is interchangeable with the term "artificial
chromosome expression system (ACes)". These artificial chromosomes
(ACes) include those that are substantially all neutral non-coding
sequences (heterochromatin) except for foreign heterologous, typically
gene-encoding, nucleic acid that is interspersed within the
heterochromatin for the expression therein (see U.S. Patent Nos.
6,025,155 and 6,077,697 and International PCT application No. WO
97/40183), or that is in a single locus as provided herein. Also included
are ACes that may include euchromatin and that result from the process
described in U.S. Patent Nos. 6,025,155 and 6,077,697 and International
PCT application No. WO 97/40183 and outlined herein. The delineating
structural feature is the presence of repeating units that are generally
predominantly heterochromatin. The precise structure of the ACes will
depend upon the structure of the chromosome in which the initial
amplification event occurs; all share the common feature of including a
defined pattern of repeating units. Generally ACes have more
heterochromatin than euchromatin. Foreign nucleic acid molecules
(heterologous genes) contained in these artificial chromosome expression
systems can include any nucleic acid whose expression is of interest in a
particular host cell. Such foreign nucleic acid molecules include, but are
not limited to, nucleic acid that encodes traceable marker proteins
(reporter genes), such as fluorescent proteins, such as green, blue or red
fluorescent proteins (GFP, BFP and RFP, respectively), other reporter
genes, such as β-galactosidase, and proteins that confer drug resistance,
such as a gene encoding hygromycin resistance. Other examples of
heterologous nucleic acid molecules include, but are not limited to, DNA
that encodes therapeutically effective substances, such as anti-cancer
agents, enzymes and hormones, DNA that encodes other types of
proteins, such as antibodies, and DNA that encodes RNA molecules (such
as antisense or siRNA molecules) that are not translated into proteins.

As used herein, an artificial chromosome platform, also referred to 

10 herein as a "platform ACes" or "ACes platform", refers to an artificial 
chromosome that has been engineered to include one or more sites for 
site-specific, recombination-directed integration. In particular, ACes that 
are so-engineered are provided. Any sites, including but not limited to 
any described herein, that are suitable for such integration are 

15 contemplated. Plant and animal platform ACes are provided. Among the 
ACes contemplated herein are those that are predominantly 
heterochromatic (formerly referred to as satellite artificial chromosomes 
(SATACs); see, e.g., U.S. Patent Nos. 6,077,697 and 6,025,155 and 
published International PCT application No. WO 97/40183), and artificial
chromosomes predominantly made up of repeating nucleic acid units and
that contain substantially equivalent amounts of euchromatic and 
heterochromatic DNA resulting from an amplification event depicted in the 
referenced patent and herein. Included among the ACes for use in
generating platforms are artificial chromosomes that introduce and
express heterologous nucleic acids in plants (see copending U.S.
provisional application Serial No. 60/294,687, filed on May 30, 2001).
These include artificial chromosomes that have a centromere derived from 
a plant, and, also, artificial chromosomes that have centromeres that may 
be derived from other organisms but that function in plants. 

As used herein, a "reporter ACes" refers to an ACes that
comprises one or a plurality of reporter constructs, where the reporter 
construct comprises a reporter gene in operative linkage with a regulatory 
region responsive to test or known compounds. 

As used herein, amplification, with reference to DNA, is a process
in which segments of DNA are duplicated to yield two or multiple copies
of substantially similar or identical or nearly identical DNA segments that 
are typically joined as substantially tandem or successive repeats or 
inverted repeats. 

As used herein, amplification-based artificial chromosomes are
artificial chromosomes derived from natural or endogenous chromosomes
by virtue of an amplification event, such as one initiated by introduction 
of heterologous nucleic acid into rDNA in a chromosome. As a result of 
such an event, chromosomes and fragments thereof exhibiting segmented
or repeating patterns arise. Artificial chromosomes can be formed from
these chromosomes and fragments. Hence, amplification-based artificial 
chromosomes refer to engineered chromosomes that exhibit an ordered 
segmentation that is not observed in naturally occurring chromosomes 
and that distinguishes them from naturally occurring chromosomes. The
segmentation, which can be visualized using a variety of chromosome
analysis techniques known to those of skill in the art, correlates with the
structure of these artificial chromosomes. In addition to containing one or 
more centromeres, the amplification-based artificial chromosomes,
throughout the region or regions of segmentation, are predominantly made
up of nucleic acid units, also referred to as "amplicons", that is (are)
repeated in the region and that have a similar gross structure. Repeats of
an amplicon tend to be of similar size and share some common nucleic 
acid sequences. For example, each repeat of an amplicon may contain a 
replication site involved in amplification of chromosome segments and/or 
some heterologous nucleic acid that was utilized in the initial production
of the artificial chromosome. Typically, the repeating units are 
substantially similar in nucleic acid composition and may be nearly 
identical. 

The amplification-based artificial chromosomes differ depending on
the chromosomal region that has undergone amplification in the process
of artificial chromosome formation. The structures of the resulting 
chromosomes can vary depending upon the initiating event and/or the 
conditions under which the heterologous nucleic acid is introduced,
including modification to the endogenous chromosomes. For example, in
some of the artificial chromosomes provided herein, the region or regions 
of segmentation may be made up predominantly of heterochromatic DNA. 
In other artificial chromosomes provided herein, the region or regions of 
segmentation may be made up predominantly of euchromatic DNA or may
be made up of similar amounts of heterochromatic and euchromatic DNA.

As used herein, an amplicon is a repeated nucleic acid unit. In
some of the artificial chromosomes described herein, an amplicon may 
contain a set of inverted repeats of a megareplicon. A megareplicon 
represents a higher order replication unit. For example, with reference to
some of the predominantly heterochromatic artificial chromosomes, the
megareplicon can contain a set of tandem DNA blocks (e.g., ~7.5 Mb
DNA blocks) each containing satellite DNA flanked by non-satellite DNA
or may be made up substantially of rDNA. Contained within the
megareplicon is a primary replication site, referred to as the
megareplicator, which may be involved in organizing and facilitating
replication of the pericentric heterochromatin and possibly the
centromeres. Within the megareplicon there may be smaller (e.g., 50-300
kb) secondary replicons.

In artificial chromosomes, such as those provided in U.S. Patent Nos.
6,025,155 and 6,077,697 and International PCT application No. WO 
97/40183, the megareplicon is defined by two tandem blocks (~7.5 Mb 
DNA blocks in the chromosomes provided therein). Within each artificial 
chromosome or among a population thereof, each amplicon has the same
gross structure but may contain sequence variations. Such variations will 
arise as a result of movement of mobile genetic elements, deletions or 
insertions or mutations that arise, particularly in culture. Such variation 
does not affect the use of the artificial chromosomes or their overall
structure as described herein.

As used herein, amplifiable, when used in reference to a 
chromosome, particularly the method of generating artificial chromosomes 
provided herein, refers to a region of a chromosome that is prone to 
amplification. Amplification typically occurs during replication and other
cellular events involving recombination (e.g., DNA repair). Such regions
include regions of the chromosome that contain tandem repeats, such as 
satellite DNA, rDNA, and other such sequences. 

As used herein, a dicentric chromosome is a chromosome that 
contains two centromeres. A multicentric chromosome contains more
than two centromeres.

As used herein, a formerly dicentric chromosome is a chromosome 
that is produced when a dicentric chromosome fragments and acquires 
new telomeres so that two chromosomes, each having one of the 
centromeres, are produced. Each of the fragments is a replicable
chromosome. If one of the chromosomes undergoes amplification of
primarily euchromatic DNA to produce a fully functional chromosome that
is predominantly (at least more than 50%) euchromatin, it is a 
minichromosome. The remaining chromosome is a formerly dicentric 
chromosome. If one of the chromosomes undergoes amplification, 
whereby heterochromatin (such as, for example, satellite DNA) is
amplified and a euchromatic portion (such as, for example, an arm) 
remains, it is referred to as a sausage chromosome. A chromosome that 
is substantially all heterochromatin, except for portions of heterologous 
DNA, is called a predominantly heterochromatic artificial chromosome.
Predominantly heterochromatic artificial chromosomes can be produced 
from other partially heterochromatic artificial chromosomes by culturing 
the cell containing such chromosomes under conditions, such as BrdU
treatment, that destabilize the chromosome and/or growth under selective
conditions so that a predominantly heterochromatic artificial chromosome
is produced. For purposes herein, it is understood that the artificial 
chromosomes may not necessarily be produced in multiple steps, but may 
appear after the initial introduction of the heterologous DNA. Typically, 
artificial chromosomes appear after about 5 to about 60, or about 5 to
about 55, or about 10 to about 55 or about 25 to about 55 or about 35
to about 55 cell doublings after initiation of artificial chromosome 
generation, or they may appear after several cycles of growth under 
selective conditions and BrdU treatment. 

As used herein, an artificial chromosome that is predominantly
heterochromatic (i.e., containing more heterochromatin than euchromatin,
typically more than about 50%, more than about 70%, or more than 
about 90% heterochromatin) may be produced by introducing nucleic acid 
molecules into cells, such as, for example, animal or plant cells, and 
selecting cells that contain a predominantly heterochromatic artificial
chromosome. Any nucleic acid may be introduced into cells in such
methods of producing the artificial chromosomes. For example, the 
nucleic acid may contain a selectable marker and/or optionally a sequence 
that targets nucleic acid to the pericentric, heterochromatic region of a 
chromosome, such as in the short arm of acrocentric chromosomes and 
nucleolar organizing regions. Targeting sequences include, but are not
limited to, lambda phage DNA and rDNA for production of predominantly 
heterochromatic artificial chromosomes in eukaryotic cells. 
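
The "predominantly heterochromatic" criterion defined above (more heterochromatin than euchromatin, typically more than about 50%, 70% or 90%) can be illustrated with a minimal Python sketch. The function name, the use of megabase (Mb) inputs and the strict comparison are assumptions made for illustration only; the text does not prescribe how the two fractions are measured.

```python
def is_predominantly_heterochromatic(heterochromatin_mb, euchromatin_mb,
                                     min_fraction=0.5):
    """Return True when heterochromatin exceeds min_fraction of total DNA.

    Hypothetical helper: inputs are estimated amounts of heterochromatic
    and euchromatic DNA in Mb; min_fraction defaults to the loosest
    threshold named in the definition (more than about 50%).
    """
    total = heterochromatin_mb + euchromatin_mb
    if total <= 0:
        raise ValueError("total DNA content must be positive")
    return heterochromatin_mb / total > min_fraction


# Example: 60 Mb heterochromatin and 40 Mb euchromatin
print(is_predominantly_heterochromatic(60, 40))                    # loosest threshold
print(is_predominantly_heterochromatic(60, 40, min_fraction=0.9))  # strictest threshold
```

Passing `min_fraction=0.7` or `min_fraction=0.9` applies the stricter thresholds mentioned in the definition.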

After introducing the nucleic acid into cells, a cell containing a 
predominantly heterochromatic artificial chromosome is selected. Such
cells may be identified using a variety of procedures. For example, 
repeating units of heterochromatic DNA of these chromosomes may be 
discerned by G-banding and/or fluorescence in situ hybridization (FISH) 
techniques. Prior to such analyses, the cells to be analyzed may be
enriched with artificial chromosome-containing cells by sorting the cells
on the basis of the presence of a selectable marker, such as a reporter 
protein, or by growing (culturing) the cells under selective conditions. It 
is also possible, after introduction of nucleic acids into cells, to select 
cells that have a multicentric, typically dicentric, chromosome, a formerly
multicentric (typically dicentric) chromosome and/or various
heterochromatic structures, such as a megachromosome and a sausage
chromosome, that contain a centromere and are predominantly 
heterochromatic and to treat them such that desired artificial 
chromosomes are produced. Cells containing a new chromosome are
selected. Conditions for generation of a desired structure include, but are
not limited to, further growth under selective conditions, introduction of 
additional nucleic acid molecules and/or growth under selective conditions 
and treatment with destabilizing agents, and other such methods (see 
International PCT application No. WO 97/40183 and U.S. Patent Nos.
6,025,155 and 6,077,697).

As used herein, a "selectable marker" is a nucleic acid segment, 
generally DNA, that allows one to select for or against a molecule or a 
cell that contains it, often under particular conditions. These markers can
encode an activity, such as, but not limited to, production of RNA, 
peptide, or protein, or can provide a binding site for RNA, peptides,
proteins, inorganic and organic compounds and compositions. Examples 
of selectable markers include but are not limited to: (1) nucleic acid 
segments that encode products that provide resistance against otherwise 
toxic compounds (e.g., antibiotics); (2) nucleic acid segments that encode
products that are otherwise lacking in the recipient cell (e.g., tRNA genes, 
auxotrophic markers); (3) nucleic acid segments that encode products 
that suppress the activity of a gene product; (4) nucleic acid segments 
that encode products that can be identified, such as phenotypic markers,
including β-galactosidase, red, blue and/or green fluorescent proteins
(FPs), and cell surface proteins; (5) nucleic acid segments that bind 
products that are otherwise detrimental to cell survival and/or function; 
(6) nucleic acid segments that otherwise inhibit the activity of any of the 
nucleic acid segments described in Nos. 1-5 above (e.g., antisense
oligonucleotides or siRNA molecules for use in RNA interference); (7)
nucleic acid segments that bind products that modify a substrate (e.g.,
restriction endonucleases); (8) nucleic acid segments that can be used to
isolate a desired molecule (e.g., specific protein binding sites); (9) nucleic
acid segments that encode a specific nucleotide sequence that can be
otherwise non-functional, such as for PCR amplification of subpopulations
of molecules; and/or (10) nucleic acid segments, which when absent, 
directly or indirectly confer sensitivity to particular compounds. Thus, for 
example, selectable markers include nucleic acids encoding fluorescent 
proteins, such as green fluorescent proteins, β-galactosidase and other
readily detectable proteins, such as chromogenic proteins or proteins
capable of being bound by an antibody and FACS sorted. Selectable
markers, such as these, which are not required for cell survival and/or
proliferation in the presence of a selection agent, are also referred to 
herein as reporter molecules. Other selectable markers, e.g., the 
neomycin phosphotransferase gene, provide for isolation and identification
of cells containing them by conferring properties on the cells that make 
them resistant to an agent, e.g., a drug such as an antibiotic, that inhibits 
proliferation of cells that do not contain the marker. 

As another example, interference of gene expression by
double-stranded RNA has been shown in Caenorhabditis elegans, plants,
Drosophila, protozoans and mammals. This method is known as RNA 
interference (RNAi) and utilizes short, double-stranded RNA molecules 
(siRNAs). The siRNAs are generally composed of a 19-22 bp
double-stranded RNA stem, a loop region and a 1-4 bp overhang on the 3' end.
The reduction of gene expression has been accomplished by direct
introduction of the siRNAs into the cell (Harborth J et al., 2001, J Cell Sci
114(Pt 24):4557-65) as well as the introduction of DNA encoding and
expressing the siRNA molecule. The encoded siRNA molecules are under
the regulation of an RNA polymerase III promoter (see, e.g., Yu et al.,
2002, Proc Natl Acad Sci USA 99(9):6047-52; Brummelkamp et al.,
2002, Science 296(5567):550-3; Miyagishi et al., 2002, Nat Biotechnol
20(5):497-500; and the like). In certain embodiments, RNAi in 
mammalian cells may have advantages over other therapeutic methods.
For example, producing siRNA molecules that block viral genetic activities
in infected cells may reduce the effects of the virus. Platform ACes 
provided herein encoding siRNA molecule(s) are an additional utilization of 
the platform ACes technology. The platform ACes could be engineered to 
encode one or more siRNA molecules to create gene "knockdowns". In
one embodiment, a platform ACes can be engineered to encode both the
siRNA molecule and a replacement gene. For example, a mouse model or
cell culture system could be generated using a platform ACes that has a 
knockdown of the endogenous mouse gene, by siRNA, and the human 
gene homolog expressing in place of the mouse gene. The placement of 
siRNA encoding sequences under the regulation of a regulatable or 
inducible promoter would allow one to temporally and/or spatially control 
the knockdown effect of the corresponding gene. 

As used herein, a reporter gene includes any gene that expresses a 
detectable gene product, which may be RNA or protein. Generally,
reporter genes are readily detectable. Examples of reporter genes include,
but are not limited to, nucleic acid encoding a fluorescent protein, CAT
(chloramphenicol acetyl transferase) (Alton et al. (1979) Nature 282:864-869),
luciferase, and other enzyme detection systems, such as
beta-galactosidase; firefly luciferase (de Wet et al. (1987) Mol. Cell. Biol.
7:725-737); bacterial luciferase (Engebrecht and Silverman (1984) Proc.
Natl. Acad. Sci. U.S.A. 81:4154-4158; Baldwin et al. (1984)
Biochemistry 23:3663-3667); and alkaline phosphatase (Toh et al. (1989)
Eur. J. Biochem. 182:231-238; Hall et al. (1983) J. Mol. Appl. Gen.
2:101).

As used herein, growth under selective conditions means growth of 
a cell under conditions that require expression of a selectable marker for 
survival. 

As used herein, an agent that destabilizes a chromosome is any 
agent known by those skilled in the art to enhance amplification events
and/or mutations. Such agents, which include BrdU, are well known to
those skilled in the art. 

In order to generate an artificial chromosome containing a particular 
heterologous nucleic acid of interest, it is possible to include the nucleic 
acid in the nucleic acid that is being introduced into cells to initiate 
production of the artificial chromosome. Thus, for example, a nucleic 
acid can be introduced into a cell along with nucleic acid encoding a 
selectable marker and/or a nucleic acid that targets to a heterochromatic 
region of a chromosome. For introducing a heterologous nucleic acid into 
the cell, it can be included in a fragment that includes a selectable marker
or as part of a separate nucleic acid fragment and introduced into the cell 
with a selectable marker during the process of generating the artificial 
chromosomes. Alternatively, heterologous nucleic acid can be introduced 
into an artificial chromosome at a later time after the initial generation of
the artificial chromosome. 

As used herein, the minichromosome refers to a chromosome 
derived from a multicentric, typically dicentric, chromosome that contains 
more euchromatic than heterochromatic DNA. For purposes herein, the
minichromosome contains a de novo centromere (e.g., a neocentromere).
In some embodiments, for example, the minichromosome contains a 
centromere that replicates in animals, e.g., a mammalian centromere or in 
plants, e.g., a plant centromere. 

As used herein, in vitro assembled artificial chromosomes or
synthetic chromosomes can be either more euchromatic than
heterochromatic or more heterochromatic than euchromatic and are
produced by joining essential components of a chromosome in vitro. 
These components include at least a centromere, a megareplicator, a 
telomere and optionally secondary origins of replication. 

As used herein, in vitro assembled plant or animal artificial
chromosomes are produced by joining essential components (at least the
centromere, telomere(s), megareplicator and optional secondary origins of 
replication) that function in plants or animals. In particular embodiments, 
the megareplicator contains sequences of rDNA, particularly plant or
animal rDNA.

As used herein, a plant is a eukaryotic organism that contains, in 
addition to a nucleus and mitochondria, chloroplasts capable of carrying 
out photosynthesis. A plant can be unicellular or multicellular and can 
contain multiple tissues and/or organs. Plants can reproduce sexually or 
asexually and can be perennial or annual in growth. Plants can also be
terrestrial or aquatic. The term "plant" includes a whole plant, plant cell, 
plant protoplast, plant calli, plant seed, plant organ, plant tissue, and 
other parts of a whole plant. 

As used herein, stable maintenance of chromosomes occurs when
at least about 85%, preferably 90%, more preferably 95%, of the cells
retain the chromosome. Stability is measured in the presence of a 
selective agent. Preferably these chromosomes are also maintained in the 
absence of a selective agent. Stable chromosomes also retain their
structure during cell culturing, suffering no unintended intrachromosomal
or interchromosomal rearrangements. 
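
The retention criterion in this definition can be expressed as a short Python sketch. The function names and the idea of scoring a counted sample of cells are assumptions introduced for illustration; the text states only the thresholds (at least about 85%, preferably 90%, more preferably 95%).

```python
def retention_fraction(cells_with_chromosome, cells_scored):
    """Fraction of scored cells that retain the chromosome (hypothetical helper)."""
    if cells_scored <= 0:
        raise ValueError("cells_scored must be positive")
    if not 0 <= cells_with_chromosome <= cells_scored:
        raise ValueError("cells_with_chromosome must lie between 0 and cells_scored")
    return cells_with_chromosome / cells_scored


def is_stably_maintained(cells_with_chromosome, cells_scored, threshold=0.85):
    """Apply the 'at least about 85%' criterion; pass 0.90 or 0.95 for the
    preferred and more preferred levels named in the definition."""
    return retention_fraction(cells_with_chromosome, cells_scored) >= threshold


# Example: 92 of 100 scored cells retain the chromosome
print(is_stably_maintained(92, 100))                  # default 85% threshold
print(is_stably_maintained(92, 100, threshold=0.95))  # stricter 95% threshold
```

This sketch covers only the retention percentage; the structural requirement (no unintended rearrangements) would need separate cytogenetic evidence.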

As used herein, de novo, with reference to a centromere, refers to
generation of an excess centromere in a chromosome as a result of
incorporation of a heterologous nucleic acid fragment using the methods
herein.

As used herein, BrdU refers to 5-bromodeoxyuridine, which during 
replication is inserted in place of thymidine. BrdU is used as a mutagen; it 
also inhibits condensation of metaphase chromosomes during cell 
division. 

As used herein, ribosomal RNA (rRNA) is the specialized RNA that
forms part of the structure of a ribosome and participates in the synthesis
of proteins. Ribosomal RNA is produced by transcription of genes which, 
in eukaryotic cells, are present in multiple copies. In human cells, the 
approximately 250 copies of rRNA genes (i.e., genes which encode rRNA)
per haploid genome are spread out in clusters on at least five different
chromosomes (chromosomes 13, 14, 15, 21 and 22). In mouse cells, the
presence of ribosomal DNA (rDNA, which is DNA containing sequences
that encode rRNA) has been verified on at least 11 pairs out of 20 mouse
chromosomes (chromosomes 5, 6, 7, 9, 11, 12, 15, 16, 17, 18, and 19)
(see, e.g., Rowe et al. (1996) Mamm. Genome 7:886-889 and Johnson et
al. (1993) Mamm. Genome 4:49-52). In Arabidopsis thaliana, the
presence of rDNA has been verified on chromosomes 2 and 4 (18S, 5.8S,
and 25S rDNA) and on chromosomes 3, 4, and 5 (5S rDNA) (see The
Arabidopsis Genome Initiative (2000) Nature 408:796-815). In
eukaryotic cells, the multiple copies of the highly conserved rRNA genes
are located in a tandemly arranged series of rDNA units, which are 
generally about 40-45 kb in length and contain a transcribed region and a 
nontranscribed region known as spacer (i.e., intergenic spacer) DNA,
which can vary in length and sequence. In the human and mouse, these
tandem arrays of rDNA units are located adjacent to the pericentric 
satellite DNA sequences (heterochromatin). The regions of these 
chromosomes in which the rDNA is located are referred to as nucleolar 
organizing regions (NOR), which loop into the nucleolus, the site of
ribosome production within the cell nucleus.

As used herein, a megachromosome refers to a chromosome that, 
except for introduced heterologous DNA, is substantially composed of 
heterochromatin. Megachromosomes are made up of an array of repeated 
amplicons that contain two inverted megareplicons bordered by
introduced heterologous DNA (see, e.g., Figure 3 of U.S. Patent No.
6,077,697 for a schematic drawing of a megachromosome). For 
purposes herein, a megachromosome is about 50 to 400 Mb, generally 
about 250-400 Mb. Shorter variants are also referred to as truncated 
megachromosomes (about 90 to 120 or 150 Mb), dwarf
megachromosomes (~150-200 Mb), and micro-megachromosomes
(~50-90 Mb, typically 50-60 Mb). For purposes herein, the term
megachromosome refers to the overall repeated structure based on an
array of repeated chromosomal segments (amplicons) that contain two 
inverted megareplicons bordered by any inserted heterologous DNA. The 
size will be specified. 
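
The approximate size classes named in this definition can be tabulated in a short Python sketch. The table name, the function name and the inclusive bounds are assumptions made for illustration; the ranges in the text are approximate ("about") and adjoin at their boundaries, so the helper returns every class whose range contains the given size. The truncated class uses the 150 Mb upper figure.

```python
# Approximate size classes in Mb, taken from the definition above.
# Treating each range as inclusive at both ends is an assumption.
SIZE_CLASSES_MB = [
    ("micro-megachromosome", 50, 90),
    ("truncated megachromosome", 90, 150),
    ("dwarf megachromosome", 150, 200),
    ("megachromosome (general range)", 250, 400),
]


def size_classes(size_mb):
    """Return the names of all size classes whose range contains size_mb."""
    return [name for name, low, high in SIZE_CLASSES_MB if low <= size_mb <= high]


print(size_classes(60))   # a single class
print(size_classes(150))  # a boundary value matches two adjoining classes
print(size_classes(225))  # a size between the stated ranges matches none
```

Because the definition states that the size will be specified, such a lookup is only a convenience for labeling and does not replace the stated size.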

As used herein, gene therapy involves the transfer or insertion of
nucleic acid molecules into certain cells, which are also referred to as
target cells, to produce specific products that are involved in preventing, 
curing, correcting, controlling or modulating diseases, disorders and 
deleterious conditions. The nucleic acid is introduced into the selected
target cells in a manner such that the nucleic acid is expressed and a
product encoded thereby is produced. Alternatively, the nucleic acid may
in some manner mediate expression of DNA that encodes a therapeutic 
product. This product may be a therapeutic compound, which is 
produced in therapeutically effective amounts or at a therapeutically
useful time. It may also encode a product, such as a peptide or RNA,
that in some manner mediates, directly or indirectly, expression of a 
therapeutic product. Expression of the nucleic acid by the target cells 
within an organism afflicted with a disease or disorder thereby provides 
for modulation of the disease or disorder. The nucleic acid encoding the
therapeutic product may be modified prior to introduction into the cells of
the afflicted host in order to enhance or otherwise alter the product or 
expression thereof. 

For use in gene therapy, cells can be transfected in vitro, followed 
by introduction of the transfected cells into an organism. This is often
referred to as ex vivo gene therapy. Alternatively, the cells can be
transfected directly in vivo within an organism. 

As used herein, therapeutic agents include, but are not limited to, 
growth factors, antibodies, cytokines, such as tumor necrosis factors and 
interleukins, and cytotoxic agents and other agents disclosed herein and 
known to those of skill in the art. Such agents include, but are not 
limited to, tumor necrosis factor, α-interferon, β-interferon, nerve growth
factor, platelet derived growth factor, tissue plasminogen activator; or,
biological response modifiers such as, for example, lymphokines,
interleukin-1 (IL-1), interleukin-2 (IL-2), interleukin-6 (IL-6), granulocyte
macrophage colony stimulating factor (GM-CSF), granulocyte colony
stimulating factor (G-CSF), erythropoietin (EPO), pro-coagulants such as
tissue factor and tissue factor variants, pro-apoptotic agents such as
FAS-ligand, fibroblast growth factors (FGF), nerve growth factor and other
growth factors. 

As used herein, a therapeutically effective product is a product that
is encoded by heterologous DNA and that, upon introduction of the DNA
into a host, is expressed and effectively ameliorates or eliminates the
symptoms or manifestations of an inherited or acquired disease, or that
cures the disease.

As used herein, transgenic plants and animals refer to plants and 
animals in which heterologous or foreign nucleic acid is expressed or in 
which the expression of a gene naturally present in the plant or animal 
has been altered by virtue of introduction of heterologous or foreign 
nucleic acid. 

As used herein, IRES (internal ribosome entry site; see, e.g., SEQ 
ID No. 27 and nucleotides 2736-3308 of SEQ ID No. 28) refers to a region
of a nucleic acid molecule, such as an mRNA molecule, that allows 
internal ribosome entry sufficient to initiate translation, which initiation 
can be detected in an assay for cap-independent translation (see, e.g., 
U.S. Patent No. 6,171,821). The presence of an IRES within an mRNA 
molecule allows cap-independent translation of a linked protein-encoding 
sequence that otherwise would not be translated. 

Internal ribosome entry site (IRES) elements were first identified in
picornaviruses, which elements are considered the paradigm for cap- 
independent translation. The 5' UTRs of all picornaviruses are long and 
mediate translational initiation by directly recruiting and binding 
ribosomes, thereby circumventing the initial cap-binding step. Although IRES
elements are frequently found in viral mRNA, they are rare in non-viral
mRNA. Among non-viral mRNA molecules that contain functional IRES 
elements in their respective 5' UTRs are those encoding immunoglobulin 
heavy chain binding protein (BiP) (Macejak et al. (1991) Nature
353:90-94); Drosophila Antennapedia (Oh et al. (1992) Genes Dev.
6:1643-1653); D. Ultrabithorax (Ye et al. (1997) Mol. Cell Biol.
17:1714-21); fibroblast growth factor 2 (Vagner et al. (1995) Mol. Cell
Biol. 15:35-44); initiation factor eIF4G (Gan et al. (1998) J. Biol. Chem.
273:5006-5012); proto-oncogene c-myc (Nanbru et al. (1995) J. Biol.
Chem. 272:32061-32066; Stoneley (1998) Oncogene 16:423-428);
IRES H, from the 5'UTR of NRF1 gene (Oumard et al. (2000) Mol. and Cell
Biol., 20(8):2755-2759); and vascular endothelial growth factor (VEGF)
(Stein et al. (1998) Mol. Cell Biol. 18:3112-9).

As used herein, a promoter, with respect to a region of DNA, refers
to a sequence of DNA that contains a sequence of bases that signals RNA
polymerase to associate with the DNA and initiate transcription of RNA 
(such as pol II for mRNA) from a template strand of the DNA. A promoter 
thus generally regulates transcription of DNA into mRNA. A particular 
promoter provided herein is the Ferritin heavy chain promoter (excluding
the Iron Response Element, located in the 5'UTR), which was joined to
the 37 bp Fer-1 enhancer element. This promoter is set forth as SEQ ID
NO: 128. The endogenous Fer-1 enhancer element is located upstream of 
the Fer-1 promoter (e.g., a Fer-1 oligo was cloned proximal to the core 
promoter). 

As used herein, isolated, substantially pure nucleic acid, such as,
for example, DNA, refers to nucleic acid fragments purified according to 
standard techniques employed by those skilled in the art, such as that 
found in Sambrook et al. ((2001) Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY,
3rd edition). 

As used herein, expression refers to the transcription and/or 
translation of nucleic acid. For example, expression can be the 
transcription of a gene into an RNA molecule,
such as a messenger RNA (mRNA) molecule. Expression may further
include translation of an RNA molecule into peptides,
polypeptides, or proteins. If the nucleic acid is derived from genomic 
DNA, expression may, if an appropriate eukaryotic host cell or organism is 
selected, include splicing of the mRNA. With respect to an antisense
construct, expression may refer to the transcription of the antisense DNA.

As used herein, vector or plasmid refers to discrete elements that 
are used to introduce heterologous nucleic acids into cells for either 
expression of the heterologous nucleic acid or for replication of the 
heterologous nucleic acid. Selection and use of such vectors and
plasmids are well within the level of skill of the art.

As used herein, transformation/transfection refers to the process by 
which nucleic acid is introduced into cells. The terms transfection and 
transformation refer to the taking up of exogenous nucleic acid, e.g., an 
expression vector, by a host cell whether or not any coding sequences
are in fact expressed. Numerous methods of transfection are known to
the ordinarily skilled artisan, for example, Agrobacterium-mediated
transformation, protoplast transformation (including polyethylene glycol 
(PEG)-mediated transformation, electroporation, protoplast fusion, and 
microcell fusion), lipid-mediated delivery, liposomes, electroporation, 
sonoporation, microinjection, particle bombardment and silicon carbide
whisker-mediated transformation and combinations thereof (see, e.g.,
Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985)
Mol. Gen. Genet. 199:169-177; Reich et al. (1986) Biotechnology
4:1001-1004; Klein et al. (1987) Nature 327:70-73; U.S. Patent No.
6,143,949; Paszkowski et al. (1989) in Cell Culture and Somatic Cell
Genetics of Plants, Vol. 6, Molecular Biology of Plant Nuclear Genes, eds.
Schell, J and Vasil, L.K. Academic Publishers, San Diego, California, p.
52-68; and Frame et al. (1994) Plant J. 6:941-948), direct uptake using
calcium phosphate (CaPO4; see, e.g., Wigler et al. (1979) Proc. Natl.
Acad. Sci. U.S.A. 76:1373-1376), polyethylene glycol (PEG)-mediated
DNA uptake, lipofection (see, e.g., Strauss (1996) Meth. Mol. Biol.
54:307-327), microcell fusion (see EXAMPLES; see also Lambert (1991)
Proc. Natl. Acad. Sci. U.S.A. 88:5907-5911; U.S. Patent No. 5,396,767;
Sawford et al. (1987) Somatic Cell Mol. Genet. 13:279-284; Dhar et al.
(1984) Somatic Cell Mol. Genet. 10:547-559; and McNeill-Killary et al.
(1995) Meth. Enzymol. 254:133-152), lipid-mediated carrier systems
(see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al.
(1996) Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev.
Biol. Anim. 31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654;
Le Bolch et al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et
al. (1993) Meth. Enzymol. 217:599-618) or other suitable method.
Methods for delivery of ACes are described in copending U.S. application
Serial No. 09/815,979. Successful transfection is generally recognized
by detection of the presence of the heterologous nucleic acid within the
transfected cell, such as, for example, any visualization of the
heterologous nucleic acid or any indication of the operation of a vector
within the host cell.




As used herein, "delivery," which is used interchangeably with 
"transfection," refers to the process by which exogenous nucleic acid 
molecules are transferred into a cell such that they are located inside the 
cell. Delivery of nucleic acids is a distinct process from expression of 
nucleic acids.

As used herein, injected refers to microinjection of nucleic acid into
a cell, such as by use of a small syringe, needle, or pipette.

As used herein, substantially homologous DNA refers to DNA that
includes a sequence of nucleotides that is sufficiently similar to another
such sequence to form stable hybrids, with each other or a reference
sequence, under specified conditions.

It is well known to those of skill in this art that nucleic acid
fragments with different sequences may, under the same conditions,
hybridize detectably to the same "target" nucleic acid. Two nucleic acid
fragments hybridize detectably, under stringent conditions over a
sufficiently long hybridization period, because one fragment contains a
segment of at least about 10, 14 or 16 or more nucleotides in a sequence
that is complementary (or nearly complementary) to a substantially
contiguous sequence of at least one segment in the other nucleic acid
fragment. If the time during which hybridization is allowed to occur is
held constant, at a value during which, under preselected stringency
conditions, two nucleic acid fragments with complementary base-pairing
segments hybridize detectably to each other, departures from exact
complementarity can be introduced into the base-pairing segments, and
base-pairing will nonetheless occur to an extent sufficient to make
hybridization detectable. As the departure from complementarity between
the base-pairing segments of two nucleic acids becomes larger, and as
conditions of the hybridization become more stringent, the probability
decreases that the two segments will hybridize detectably to each other.

Two single-stranded nucleic acid segments have "substantially the
same sequence", if (a) both form a base-paired duplex with the same
segment, and (b) the melting temperatures of the two duplexes in a
solution of 0.5 X SSPE differ by less than 10°C. If the segments being
compared have the same number of bases, then to have "substantially
the same sequence", they will typically differ in their sequences at fewer
than 1 base in 10. Methods for determining melting temperatures of
nucleic acid duplexes are well known (see, e.g., Meinkoth et al. (1984)
Anal. Biochem. 138:267-284 and references cited therein).
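
By way of illustration only, the "fewer than 1 base in 10" guideline for
segments having the same number of bases can be sketched in Python as
follows. The function names and example sequences are hypothetical and
are not part of the disclosure; the sketch addresses only the sequence
comparison and does not model the melting temperature criterion, which
is determined experimentally.

```python
def count_differences(seg_a: str, seg_b: str) -> int:
    """Count positions at which two equal-length segments differ."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segments must have the same number of bases")
    return sum(1 for x, y in zip(seg_a.upper(), seg_b.upper()) if x != y)


def meets_one_in_ten_guideline(seg_a: str, seg_b: str) -> bool:
    """True when the segments differ at fewer than 1 base in 10."""
    if not seg_a:
        raise ValueError("segments must not be empty")
    # Integer arithmetic avoids floating-point edge cases at exactly 10%.
    return count_differences(seg_a, seg_b) * 10 < len(seg_a)


# Hypothetical 20-base segments used only for demonstration.
reference = "ACGTACGTACGTACGTACGT"
one_change = "ACGTACGTACGTACGTACGA"   # 1 difference in 20
two_changes = "TCGTACGTACGTACGTACGA"  # 2 differences in 20 (exactly 1 in 10)

print(meets_one_in_ten_guideline(reference, one_change))
print(meets_one_in_ten_guideline(reference, two_changes))
```

A pair that differs at exactly 1 base in 10 does not satisfy the
guideline as sketched, because the text calls for fewer than 1 in 10.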

As used herein, a nucleic acid probe is a DNA or RNA fragment
that includes a sufficient number of nucleotides to specifically hybridize to
DNA or RNA that includes complementary or substantially complementary
sequences of nucleotides. A probe may contain any number of
nucleotides, from as few as about 10 to as many as hundreds of
thousands of nucleotides. The conditions and protocols for such
hybridization reactions are well known to those of skill in the art as are
the effects of probe size, temperature, degree of mismatch, salt
concentration and other parameters on the hybridization reaction. For
example, the lower the temperature and the higher the salt concentration
at which the hybridization reaction is carried out, the greater the degree
of mismatch that may be present in the hybrid molecules.

To be used as a hybridization probe, the nucleic acid is generally
rendered detectable by labeling it with a detectable moiety or label, such
as 32P, 3H and 14C, or by other means, including chemical labeling, such
as by nick-translation in the presence of deoxyuridylate biotinylated at the
5'-position of the uracil moiety. The resulting probe includes the
biotinylated uridylate in place of thymidylate residues and can be detected
(via the biotin moieties) by any of a number of commercially available
detection systems based on binding of streptavidin to the biotin. Such
commercially available detection systems can be obtained, for example,
from Enzo Biochemicals, Inc. (New York, NY). Any other label known to
those of skill in the art, including non-radioactive labels, may be used as
long as it renders the probes sufficiently detectable, which is a function of
the sensitivity of the assay, the time available (for culturing cells,
extracting DNA, and hybridization assays), the quantity of DNA or RNA
available as a source of the probe, the particular label and the means used
to detect the label.

Once sequences with a sufficiently high degree of homology to the 
probe are identified, they can readily be isolated by standard techniques 
(see, e.g., Sambrook et al. (2001) Molecular Cloning: A Laboratory 
Manual, 3rd Edition, Cold Spring Harbor Laboratory Press). 

As used herein, DNA molecules that form stable hybrids under the
specified conditions are considered substantially homologous, and a DNA
or nucleic acid homolog refers to a nucleic acid that includes a preselected
conserved nucleotide sequence, such as a sequence encoding a
polypeptide. By the term "substantially homologous" is meant having at
least 75%, preferably 80%, preferably at least 90%, most preferably at
least 95% homology therewith or a lesser percentage of homology or
identity and conserved biological activity or function.

The terms "homology" and "identity" are often used
interchangeably. In this regard, percent homology or identity may be
determined, for example, by comparing sequence information using a GAP
computer program. The GAP program utilizes the alignment method of
Needleman and Wunsch (J. Mol. Biol. 48:443 (1970)), as revised by Smith
and Waterman (Adv. Appl. Math. 2:482 (1981)). Briefly, the GAP
program defines similarity as the number of aligned symbols (i.e.,
nucleotides or amino acids) which are similar, divided by the total number
of symbols in the shorter of the two sequences. The preferred default
parameters for the GAP program may include: (1) a unary comparison
matrix (containing a value of 1 for identities and 0 for non-identities) and
the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids
Res. 14:6745 (1986), as described by Schwartz and Dayhoff, eds.,
ATLAS OF PROTEIN SEQUENCE AND STRUCTURE, National Biomedical
Research Foundation, pp. 353-358 (1979); (2) a penalty of 3.0 for each
gap and an additional 0.10 penalty for each symbol in each gap; and (3)
no penalty for end gaps.
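
For illustration, the similarity measure and the gap penalties recited
above can be sketched in Python for a pair of sequences that has already
been aligned. This is a minimal sketch under stated assumptions and is
not the GAP program itself: it does not perform the alignment, it uses
only the unary comparison matrix, and the function names, the use of "-"
as the gap symbol and the example sequences are hypothetical.

```python
import re


def gap_similarity(aligned_a: str, aligned_b: str) -> float:
    """Identical aligned symbols divided by the length of the shorter sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Unary comparison matrix: 1 for an identity, 0 otherwise; gap columns score 0.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    shorter = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one symbol")
    return matches / shorter


def gap_penalty(aligned: str, per_gap: float = 3.0, per_symbol: float = 0.10) -> float:
    """Total penalty for the internal gaps of one aligned sequence."""
    internal = aligned.strip("-")  # end gaps carry no penalty
    return sum(per_gap + per_symbol * len(run) for run in re.findall(r"-+", internal))


# Hypothetical pre-aligned pair: one internal gap and one end gap in the second row.
aligned_a = "ACGTTACGTT"
aligned_b = "ACG--ACG--"

print(gap_similarity(aligned_a, aligned_b))
print(round(gap_penalty(aligned_b), 2))
```

In this example the six symbols of the shorter sequence are all matched,
and only the internal two-symbol gap is penalized (3.0 for the gap plus
0.10 for each of its two symbols); the trailing end gap is not.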

By sequence identity, the number of conserved amino acids is
determined by standard alignment algorithm programs, which are used with
default gap penalties established by each supplier. Substantially
homologous nucleic acid molecules would hybridize typically at moderate
stringency or at high stringency all along the length of the nucleic acid of
interest. Preferably the two molecules will hybridize under conditions of
high stringency. Also contemplated are nucleic acid molecules that
contain degenerate codons in place of codons in the hybridizing nucleic
acid molecule.

Whether any two nucleic acid molecules have nucleotide sequences
that are at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%
"identical" can be determined using known computer algorithms such as
the "FASTA" program, using for example, the default parameters as in
Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 (1988).
Alternatively the BLAST function of the National Center for Biotechnology
Information database may be used to determine relative sequence
identity.

In general, sequences are aligned so that the highest order match
is obtained. "Identity" per se has an art-recognized meaning and can be
calculated using published techniques. (See, e.g.: Computational
Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York,
1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York, 1993; Computer Analysis of Sequence Data,
Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey,
1994; Sequence Analysis in Molecular Biology, von Heinje, G., Academic
Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux,
J., eds., M Stockton Press, New York, 1991). While there exist a number
of methods to measure identity between two polynucleotide or
polypeptide sequences, the term "identity" is well known to skilled
artisans (Carillo, H. & Lipton, D., SIAM J Applied Math 48:1073 (1988)).
Methods commonly employed to determine identity or similarity between
two sequences include, but are not limited to, those disclosed in Guide to
Huge Computers, Martin J. Bishop, ed., Academic Press, San Diego,
1994, and Carillo, H. & Lipton, D., SIAM J Applied Math 48:1073
(1988). Methods to determine identity and similarity are codified in
computer programs. Preferred computer program methods to determine
identity and similarity between two sequences include, but are not limited
to, GCG program package (Devereux, J., et al., Nucleic Acids Research
12(I):387 (1984)), BLASTP, BLASTN, FASTA (Altschul, S.F., et al., J
Molec Biol 215:403 (1990)).

Therefore, as used herein, the term "identity" represents a
comparison between a test and a reference polypeptide or polynucleotide.
For example, a test polypeptide may be defined as any polypeptide that is
90% or more identical to a reference polypeptide.

As used herein, the term at least "90% identical to" refers to
percent identities from 90 to 99.99 relative to the reference polypeptides.
Identity at a level of 90% or more is indicative of the fact that, assuming
for exemplification purposes that a test and a reference polypeptide each
100 amino acids in length are compared, no more than 10% (i.e., 10 out
of 100) of the amino acids in the test polypeptide differ from those of the
reference polypeptide. Similar comparisons may be made between test and
reference polynucleotides. Such differences may be represented as point
mutations randomly distributed over the entire length of an amino acid
sequence or they may be clustered in one or more locations of varying
length up to the maximum allowable, e.g., 10/100 amino acid difference
(approximately 90% identity). Differences are defined as nucleic acid or
amino acid substitutions, or deletions.

As used herein, stringency of hybridization in determining
percentage mismatch encompasses the following conditions or equivalent
conditions thereto:

1) high stringency: 0.1 x SSPE or SSC, 0.1% SDS, 65°C

2) medium stringency: 0.2 x SSPE or SSC, 0.1% SDS, 50°C

3) low stringency: 1.0 x SSPE or SSC, 0.1% SDS, 50°C

or any combination of salt and temperature and other reagents that result
in selection of the same degree of mismatch or matching. Equivalent
conditions refer to conditions that select for substantially the same
percentage of mismatch in the resulting hybrids. Additions of ingredients,
such as formamide, Ficoll, and Denhardt's solution affect parameters such
as the temperature under which the hybridization should be conducted
and the rate of the reaction. Thus, hybridization in 5 X SSC, in 20%
formamide at 42°C is substantially the same as the conditions recited
above for hybridization under conditions of low stringency. The recipes for
SSPE, SSC and Denhardt's and the preparation of deionized formamide
are described, for example, in Sambrook et al. (1989) Molecular Cloning,
A Laboratory Manual, Cold Spring Harbor Laboratory Press, Chapter 8;
see, Sambrook et al., vol. 3, p. B.13; see, also, numerous catalogs that
describe commonly used laboratory solutions. It is understood that
equivalent stringencies may be achieved using alternative buffers, salts
and temperatures. As used herein, all assays and procedures, such as
hybridization reactions and antibody-antigen reactions, unless otherwise
specified, are conducted under conditions recognized by those of skill in
the art as standard conditions.

As used herein, conservative amino acid substitutions, such as
those set forth in Table 1, are those that do not eliminate biological
activity. Suitable conservative substitutions of amino acids are known to
those of skill in this art and may be made generally without altering the
biological activity of the resulting molecule. Those of skill in this art
recognize that, in general, single amino acid substitutions in non-essential
regions of a polypeptide do not substantially alter biological activity (see,
e.g., Watson et al. Molecular Biology of the Gene, 4th Edition, 1987, The
Benjamin/Cummings Pub. co., p. 224). Conservative amino acid
substitutions are made, for example, in accordance with those set forth in
TABLE 1 as follows:

TABLE 1

Original residue    Conservative substitution
Ala (A)             Gly; Ser; Abu
Arg (R)             Lys; Orn
Asn (N)             Gln; His
Cys (C)             Ser
Gln (Q)             Asn
Glu (E)             Asp
Gly (G)             Ala; Pro
His (H)             Asn; Gln
Ile (I)             Leu; Val; Met; Nle; Nva
Leu (L)             Ile; Val; Met; Nle; Nva
Lys (K)             Arg; Gln; Glu
Met (M)             Leu; Tyr; Ile; Nle; Val
Ornithine           Lys; Arg
Phe (F)             Met; Leu; Tyr
Ser (S)             Thr
Thr (T)             Ser
Trp (W)             Tyr
Tyr (Y)             Trp; Phe
Val (V)             Ile; Leu; Met; Nle; Nva

Other substitutions are also permissible and may be determined
empirically or in accord with known conservative substitutions.
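
As an illustrative aid only, the substitutions of Table 1 can be held in
a lookup structure. The Python sketch below is an assumption about how
one might encode the table; the names CONSERVATIVE_SUBSTITUTIONS and
is_conservative are hypothetical, and "Orn" is used here as shorthand for
ornithine. The lookup is directional, following the table as written
(for example, Cys to Ser is listed, whereas Ser to Cys is not).

```python
# Table 1 encoded as: original residue -> set of listed conservative substitutions.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Gly", "Ser", "Abu"},
    "Arg": {"Lys", "Orn"},
    "Asn": {"Gln", "His"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Ala", "Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val", "Met", "Nle", "Nva"},
    "Leu": {"Ile", "Val", "Met", "Nle", "Nva"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Tyr", "Ile", "Nle", "Val"},
    "Orn": {"Lys", "Arg"},  # ornithine
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu", "Met", "Nle", "Nva"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if Table 1 lists `replacement` as a substitution for `original`."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())


print(is_conservative("Ala", "Gly"))
print(is_conservative("Ala", "Trp"))
```

Because other substitutions may be determined empirically, a negative
result from such a lookup means only that the pair is not listed in
Table 1.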

As used herein, the amino acids, which occur in the various amino
acid sequences appearing herein, are identified according to their
well-known, three-letter or one-letter abbreviations. The nucleotides, which
occur in the various DNA fragments, are designated with the standard
single-letter designations used routinely in the art.

As used herein, a splice variant refers to a variant produced by 
differential processing of a primary transcript of genomic DNA that results 
in more than one type of mRNA.

As used herein, a probe or primer based on a nucleotide sequence 
includes at least 10, 14, 16, 30 or 100 contiguous nucleotides from the 
reference nucleic acid molecule. 
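
The contiguity requirement can be illustrated with a short Python
sketch. The function name, the default minimum length of 14 and the
example sequences are hypothetical choices made for demonstration; the
check simply asks whether any window of the candidate probe of the
stated length occurs in the reference sequence.

```python
def shares_contiguous_run(probe: str, reference: str, min_length: int = 14) -> bool:
    """True if `probe` contains at least `min_length` contiguous nucleotides
    that also occur contiguously in `reference`."""
    if min_length <= 0:
        raise ValueError("min_length must be positive")
    probe, reference = probe.upper(), reference.upper()
    # Slide a window of min_length along the probe and look for it in the reference.
    return any(
        probe[start:start + min_length] in reference
        for start in range(len(probe) - min_length + 1)
    )


# Hypothetical sequences used only for demonstration.
reference = "ACGTACGTTAGCCGATAGCTAGGCTA"
probe = "GGG" + reference[5:19] + "CCC"  # carries 14 contiguous reference bases

print(shares_contiguous_run(probe, reference))
print(shares_contiguous_run("ACGTACGTTA", reference, min_length=14))
```

A probe shorter than the chosen minimum length cannot satisfy the check,
which is why the second call reports no qualifying run.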

As used herein, recombinant production by using recombinant DNA 
methods refers to the use of the well known methods of molecular
biology for expressing proteins encoded by cloned DNA. 

As used herein, biological activity refers to the in vivo activities of
a compound or physiological responses that result upon in vivo
administration of a compound, composition or other mixture. Biological
activity, thus, encompasses therapeutic effects and pharmaceutical
activity of such compounds, compositions and mixtures. Biological
activities may be observed in in vitro systems designed to test or use
such activities. Thus, for purposes herein the biological activity of a
luciferase is its oxygenase activity whereby, upon oxidation of a
substrate, light is produced.

The terms substantially identical or similar vary with the context
as understood by those skilled in the relevant art and generally mean at
least 40, 60, 80, 90, 95 or 98%.

As used herein, substantially identical to a product means 
sufficiently similar so that the property is sufficiently unchanged so that
the substantially identical product can be used in place of the product. 

As used herein, substantially pure means sufficiently homogeneous
to appear free of readily detectable impurities as determined by standard
methods of analysis, such as thin layer chromatography (TLC), gel
electrophoresis and high performance liquid chromatography (HPLC), used
by those of skill in the art to assess such purity, or sufficiently pure such
that further purification would not detectably alter the physical and
chemical properties, such as enzymatic and biological activities, of the
substance. Methods for purification of the compounds to produce
substantially chemically pure compounds are known to those of skill in
the art. A substantially chemically pure compound may, however, be a
mixture of stereoisomers or isomers. In such instances, further
purification might increase the specific activity of the compound.

As used herein, vector (or plasmid) refers to discrete elements that
are used to introduce heterologous DNA into cells for either expression or
replication thereof. The vectors typically remain episomal, but may be
designed to effect integration of a gene or portion thereof into a
chromosome of the genome. Also contemplated are vectors that are
artificial chromosomes, such as yeast artificial chromosomes and
mammalian artificial chromosomes. Selection and use of such vehicles
are well known to those of skill in the art. An expression vector includes
vectors capable of expressing DNA that is operatively linked with
regulatory sequences, such as promoter regions, that are capable of
effecting expression of such DNA fragments. Thus, an expression vector
refers to a recombinant DNA or RNA construct, such as a plasmid, a
phage, recombinant virus or other vector that, upon introduction into an
appropriate host cell, results in expression of the cloned DNA.
Appropriate expression vectors are well known to those of skill in the art
and include those that are replicable in eukaryotic cells and/or prokaryotic
cells and those that remain episomal or those which integrate into the
host cell genome.

As used herein, protein-binding-sequence refers to a protein or 
peptide sequence that is capable of specific binding to other protein or 
peptide sequences generally, to a set of protein or peptide sequences or
to a particular protein or peptide sequence. 

As used herein, a composition refers to any mixture of two or more
ingredients. It may be a solution, a suspension, liquid, powder, a paste,
aqueous, non-aqueous or any combination thereof.

As used herein, a combination refers to any association between
two or more items.

As used herein, fluid refers to any composition that can flow. 
Fluids thus encompass compositions that are in the form of semi-solids, 
pastes, solutions, aqueous mixtures, gels, lotions, creams and other such 
compositions.

As used herein, a cellular extract refers to a preparation or fraction 
that is made from a lysed or disrupted cell. 

As used herein, the term "subject" refers to animals, plants,
insects, and birds and other phyla, genera and species into which nucleic
acid molecules may be introduced. Included are higher organisms, such
as mammals, fish, insects and birds, including humans, primates, cattle,
pigs, rabbits, goats, sheep, mice, rats, guinea pigs, hamsters, cats, dogs,
horses, chicken and others.

As used herein, flow cytometry refers to processes that use a
laser-based instrument capable of analyzing and sorting out cells and/or
chromosomes based on size and fluorescence.

As used herein, the abbreviations for any protective groups, amino
acids and other compounds, are, unless indicated otherwise, in accord
with their common usage, recognized abbreviations, or the IUPAC-IUB
Commission on Biochemical Nomenclature (see, (1972) Biochem.
11:942-944).

B. Recombination systems 

Site-specific recombination systems typically contain three
elements: a pair of DNA sequences (the site-specific recombination
sequences) and a specific enzyme (the site-specific recombinase). The
site-specific recombinase catalyzes a recombination reaction between two
site-specific recombination sequences.

A number of different site-specific recombinase systems are
available and/or known to those of skill in the art, including, but not
limited to: the Cre/lox recombination system using CRE recombinase (see,
e.g., SEQ ID Nos. 58 and 59) from the Escherichia coli phage P1 (see,
e.g., Sauer (1993) Methods in Enzymology 225:890-900; Sauer et al.
(1990) The New Biologist 2:441-449; Sauer (1994) Current Opinion in
Biotechnology 5:521-527; Odell et al. (1990) Mol. Gen. Genet. 223:369-378;
Lasko et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:6232-6236;
U.S. Patent No. 5,658,772), the FLP/FRT system of yeast using the FLP
recombinase (see, SEQ ID Nos. 60 and 61) from the 2μ episome of
Saccharomyces cerevisiae (Cox (1983) Proc. Natl. Acad. Sci. U.S.A.
80:4223; Falco et al. (1982) Cell 29:573-584; Golic et al. (1989)
Cell 59:499-509; U.S. Patent No. 5,744,336), the resolvases, including
Gin recombinase of phage Mu (Maeser et al. (1991) Mol. Gen. Genet.
230:170-176; Klippel, A. et al. (1993) EMBO J. 12:1047-1057; see, e.g.,
SEQ ID Nos. 64-67), Cin, Hin, aS Tn3; the Pin recombinase of E. coli
(see, e.g., SEQ ID Nos. 68 and 69; Enomoto et al. (1983) J. Bacteriol.
5:663-668), the R/RS system of the pSR1 plasmid of Zygosaccharomyces
rouxii (Araki et al. (1992) J. Mol. Biol. 225:25-37; Matsuzaki et al. (1990)
J. Bacteriol. 172:610-618) and site-specific recombinases from
Kluyveromyces drosophilarium (Chen et al. (1986) Nucleic Acids Res.
314:4471-4481) and Kluyveromyces waltii (Chen et al. (1992) J. Gen.
Microbiol. 138:337-345). Other systems are known to those of skill in
the art (Stark et al. Trends Genet. 8:432-439; Utatsu et al. (1987) J.
Bacteriol. 169:5537-5545; see, also, U.S. Patent No. 6,171,861).

Members of the highly related family of site-specific recombinases,
the resolvase family, such as γδ, Tn3 resolvase, Hin, Gin, and Cin are also
available. Members of this family of recombinases are typically
constrained to intramolecular reactions (e.g., inversions and excisions)
and can require host-encoded factors. Mutants have been isolated that
relieve some of the requirements for host factors (Maeser et al. (1991)
Mol. Gen. Genet. 230:170-176), as well as some of the constraints
of intramolecular recombination (see, U.S. Patent No. 6,171,861).

The bacteriophage P1 Cre/lox and the yeast FLP/FRT systems are
particularly useful systems for site-specific integration, inversion or
excision of heterologous nucleic acid into, and out of, chromosomes,
particularly ACes as provided herein. In these systems a recombinase
(Cre or FLP) interacts specifically with its respective site-specific
recombination sequence (lox or FRT, respectively) to invert or excise the
intervening sequences. The recombination sequence for each of these two
systems is relatively short (34 bp for lox and 47 bp for FRT).

The FLP/FRT recombinase system has been demonstrated to
function efficiently in plant cells (U.S. Patent No. 5,744,386), and, thus,
can be used for producing plant artificial chromosome platforms. In
general, short incomplete FRT sites lead to higher accumulation of
excision products than the complete full-length FRT sites. The system
catalyzes intra- and intermolecular reactions, and, thus, can be used for
DNA excision and integration reactions. The recombination reaction is
reversible and this reversibility can compromise the efficiency of the
reaction in each direction. Altering the structure of the site-specific
recombination sequences is one approach to remedying this situation.
The site-specific recombination sequence can be mutated in a manner
that the product of the recombination reaction is no longer recognized as
a substrate for the reverse reaction, thereby stabilizing the integration or
excision event.

In the Cre-lox system, discovered in bacteriophage P1,
recombination between loxP sites occurs in the presence of the Cre
recombinase (see, e.g., U.S. Patent No. 5,658,772). This system can be
used to insert, invert or excise nucleic acid located between two lox sites.
Cre can be expressed from a vector. Since the lox site is an asymmetrical
nucleotide sequence, lox sites on the same DNA molecule can have the
same or opposite orientation with respect to each other. Recombination
between lox sites in the same orientation results in a deletion of the DNA
segment located between the two lox sites and a connection between the
resulting ends of the original DNA molecule. The deleted DNA segment
forms a circular molecule of DNA. The original DNA molecule and the
resulting circular molecule each contain a single lox site. Recombination
between lox sites in opposite orientations on the same DNA molecule
results in an inversion of the nucleotide sequence of the DNA segment
located between the two lox sites. In addition, reciprocal exchange of
DNA segments proximate to lox sites located on two different DNA
molecules can occur. All of these recombination events are catalyzed by
the product of the Cre coding region.
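
The intramolecular outcomes described above (deletion with formation of
a circle when the lox sites have the same orientation, and inversion
when they have opposite orientations) can be sketched as a simple model
in Python. This is an illustrative abstraction only: the Lox class, the
recombine function and the representation of a DNA molecule as a list of
segments and sites are hypothetical, the lox sites are treated as
opaque oriented markers, and intermolecular exchange is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Lox:
    """A lox site reduced to its orientation (+1 or -1)."""
    orientation: int


Element = Union[str, Lox]  # a DNA segment (string of A/C/G/T) or a lox site
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _invert(elements: List[Element]) -> List[Element]:
    """Reverse-complement segments and flip any lox sites within a region."""
    inverted: List[Element] = []
    for element in reversed(elements):
        if isinstance(element, Lox):
            inverted.append(Lox(-element.orientation))
        else:
            inverted.append(element.translate(_COMPLEMENT)[::-1])
    return inverted


def recombine(molecule: List[Element], i: int, j: int
              ) -> Tuple[List[Element], Optional[List[Element]]]:
    """Recombine the lox sites at indices i < j on one linear molecule.

    Returns (linear product, excised circle or None).
    """
    first, second = molecule[i], molecule[j]
    if not (i < j and isinstance(first, Lox) and isinstance(second, Lox)):
        raise ValueError("i and j must index two lox sites with i < j")
    between = molecule[i + 1:j]
    if first.orientation == second.orientation:
        # Same orientation: the intervening segment is deleted as a circle;
        # the linear molecule and the circle each keep a single lox site.
        return molecule[:i + 1] + molecule[j + 1:], between + [second]
    # Opposite orientation: the intervening segment is inverted in place.
    return molecule[:i + 1] + _invert(between) + molecule[j:], None


direct = ["AAAA", Lox(1), "ACGG", Lox(1), "TTTT"]
inverted_repeat = ["AAAA", Lox(1), "ACGG", Lox(-1), "TTTT"]
print(recombine(direct, 1, 3))
print(recombine(inverted_repeat, 1, 3))
```

In the first call the segment between the sites leaves the linear
molecule as a circle carrying one lox site; in the second call the same
segment is retained but appears as its reverse complement.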
Any site-specific recombinase system known to those of skill in the
art is contemplated for use herein. It is contemplated that one or a
plurality of sites that direct the recombination by the recombinase are
introduced into an artificial chromosome to produce platform ACes. The
resulting platform ACes are introduced into cells with nucleic acid
encoding the cognate recombinase, typically on a vector, and nucleic acid
encoding heterologous nucleic acid of interest linked to the appropriate
recombination site for insertion into the platform ACes. The
recombinase-encoding nucleic acid may be introduced into the cells on the
same vector as, or a different vector from, that encoding the heterologous
nucleic acid.

An E. coli phage lambda integrase system for ACes platform
engineering and for artificial chromosome engineering is provided (Lorbach
et al. (2000) J. Mol. Biol. 296:1175-1181). The phage lambda integrase
(Landy, A. (1989) Annu. Rev. Biochem. 58:913-949) is adapted herein and
the cognate att sites are provided. Chromosomes, including ACes,
engineered to contain one or a plurality of att sites are provided, as are
vectors encoding a mutant integrase that functions in the absence of other
factors. Methods using the modified chromosomes and vectors for
introduction of heterologous nucleic acid are also provided.

For purposes herein, one or more of the sites (e.g., a single site or
a pair of sites) required for recombination are introduced into an artificial
chromosome, such as an ACes chromosome. The enzyme for catalyzing
site-directed recombination is introduced with the DNA of interest, or
separately, or is engineered onto the artificial chromosome under the
control of a regulatable promoter.

As described herein, artificial chromosome platforms containing one
or multiple recombination sites are provided. The methods and resulting
products are exemplified with the lambda phage Att/Int system, but
similar methods may be used for production of ACes platforms with other
recombination systems.

The Att/Int system and vectors provided herein are not only
intended for engineering ACes platforms, but may be used to engineer an
Att/Int system into any chromosome. Introduction of att sites into a
chromosome will permit engineering of natural chromosomes, such as by
permitting targeted integration of genes or regulatory regions, and by
controlled excision of selected regions. For example, genes encoding a
particular trait may be added to a chromosome, such as a plant
chromosome engineered to contain one or a plurality of att sites. Such
chromosomes may be used for screening DNA to identify genes. Large
pieces of DNA can be introduced into cells and the cells screened
phenotypically to select those having the desired trait.

C. Platforms

Provided herein are platform artificial chromosomes (platform ACes)
containing single or multiple site-specific recombination sites.
Chromosome-based platform technology permits efficient and tractable
engineering and subsequent expression of multiple gene targets. Methods
are provided that use DNA vectors and fragments to create platform
artificial chromosomes, including animal, particularly mammalian, artificial
chromosomes, and plant artificial chromosomes. The artificial
chromosomes contain either single or multiple sequence-specific
recombination sites suitable for the placement of target gene expression
vectors onto the platform chromosome. The engineered
chromosome-based platform ACes technology is applicable for methods,
including cellular and transgenic protein production, transgenic plant and
animal production and gene therapy. The platform ACes are also useful for
producing a library of ACes comprising random portions of a given
genome (e.g., a mammalian, plant or prokaryotic genome) for genomic
screening; as well as a library of cells comprising different and/or mutually
exclusive ACes therein.

Exemplary of artificial chromosome platforms are those based on
ACes. ACes artificial chromosomes are non-viral, self-replicating nucleic
acid molecules that function as a natural chromosome, having all the
elements required for normal chromosomal replication and maintenance
within the cell nucleus. ACes artificial chromosomes do not rely on
integration into the genome of the cell to be effective, and they are not
limited by DNA carrying capacity; as such, the therapeutic gene(s) of
interest, including regulatory sequences, can be engineered into the ACes.
In addition, ACes are stable in vitro and in vivo and can provide
predictable long-term gene expression. Once engineered and delivered to
the appropriate cell or embryo, ACes work independently alongside host
chromosomes and, in the case of ACes that are predominantly
heterochromatin, produce only the products (proteins) of the genes they
carry. As provided herein, ACes are modified by introduction of
recombination site(s) to provide a platform for ready introduction of
heterologous nucleic acid. The ACes platforms can be used for production
of transgenic animals and plants; as vectors for genetic therapy; for use as
protein production systems; for animal models to identify and target new
therapeutics; in cell culture for the development and production of
therapeutic proteins; and for a variety of other applications.

1. Generation of artificial chromosomes

Artificial chromosomes may be generated by any method known to
those of skill in the art. Of particular interest herein are the ACes artificial
chromosomes, which contain a repeated unit. Methods for production of
ACes are described in detail in U.S. Patent Nos. 6,025,155 and
6,077,697, which, as with all patents, applications, publications and
other disclosure, are incorporated herein in their entirety.

Generation of de novo ACes.

ACes can be generated by cotransfecting exogenous DNA, such
as a mammary tissue-specific DNA cassette including the gene sequences
for a therapeutic protein, with an rDNA fragment and a drug resistance
marker gene into the desired eukaryotic cell, such as plant or animal cells,
for example murine cells in vitro. DNA with a selectable or detectable marker
is introduced, and can be allowed to integrate randomly into pericentric
heterochromatin or can be targeted to pericentric heterochromatin, such
as that in rDNA gene arrays that reside on acrocentric chromosomes,
for example on the short arms of acrocentric chromosomes. This integration
event activates the "megareplicator" sequence and amplifies the
pericentric heterochromatin and the exogenous DNA, and duplicates a
centromere. Ensuing breakage of this "dicentric" chromosome can result
in the production of daughter cells that contain the substantially original
chromosome and the new artificial chromosome. The resulting ACes
contain all the essential elements needed for stability and replication in
dividing cells: centromere, origins of replication, and telomeres. ACes
have been produced that express marker genes (lacZ, green fluorescent
protein, neomycin-resistance, puromycin-resistance, hygromycin-resistance)
and genes of interest. Isolated ACes, for example, have been
successfully transferred intact to rodent, human, and bovine cells by
electroporation, sonoporation, microinjection, and transfection with lipids
and dendrimers.

To render the creation of ACes with desired genes more tractable
and efficient, "platform" ACes (platform-ACes) can be produced that
contain defined DNA sequences for enzyme-mediated homologous DNA
recombination, such as by Cre or FLP recombinases (Bouhassira et al.
(1996) Blood 88(supplement 1):190a; Bouhassira et al. (1997) Blood
90:3332-3344; Siebler et al. (1997) Biochemistry 36:1740-1747;
Siebler et al. (1998) Biochemistry 37:6229-6234; and Bethke et al.
(1997) Nucl. Acids Res. 25:2828-2834), and as exemplified herein the
lambda phage integrase. A lox site contains two 13 bp inverted repeats
to which Cre recombinase binds and an intervening 8 bp core region.
Only pairs of sites having identity in the central 6 bp of the core region
are proficient for recombination; sites having non-identical core sequences
(heterospecific lox sites) do not efficiently recombine with each other
(Hoess et al. (1986) Nucleic Acids Res. 14:2287-2300).
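
The lox site layout and the core-identity rule just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the wild-type loxP sequence is used as the example site, the variant site is an invented single-base substitution, and the sketch assumes that the "central 6 bp" are positions 2 to 7 of the 8 bp core.

```python
# Illustrative sketch only; helper names are hypothetical, not from the source.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def split_lox(site: str):
    """Split a 34 bp lox site into (left repeat, core, right repeat)."""
    if len(site) != 34:
        raise ValueError("expected 13 + 8 + 13 = 34 bp")
    return site[:13], site[13:21], site[21:]


def has_inverted_repeats(site: str) -> bool:
    """Check that the two 13 bp arms are inverted repeats of each other."""
    left, _, right = split_lox(site)
    return right == reverse_complement(left)


def can_recombine(site_a: str, site_b: str) -> bool:
    """True when the central 6 bp of the two 8 bp cores are identical.

    Assumption: 'central 6 bp' means positions 2-7 of the core.
    """
    core_a = split_lox(site_a)[1]
    core_b = split_lox(site_b)[1]
    return core_a[1:7] == core_b[1:7]


LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"
# Hypothetical heterospecific variant: one substitution inside the central 6 bp.
LOX_VARIANT = "ATAACTTCGTATA" + "GCATATAT" + "TATACGAAGTTAT"

if __name__ == "__main__":
    print(has_inverted_repeats(LOXP))
    print(can_recombine(LOXP, LOXP))
    print(can_recombine(LOXP, LOX_VARIANT))
```

In this toy model a site pairs with itself but not with the variant, which is the property a heterospecific pair of sites relies on.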



Generating acrocentric chromosomes for plant artificial chromosome
formation.

In human and mouse cells, de novo formation of a satellite DNA-based
artificial chromosome (SATAC, also referred to as ACes) can occur
in an acrocentric chromosome where the short arm contains only
pericentric heterochromatin, the rDNA array, and telomere sequences.
Plant species may not have any acrocentric chromosomes with the same
physical structure, but "megareplicator" DNA sequences reside
in the plant rDNA arrays, also known as the nucleolar organizing regions
(NOR). A structure like those seen in acrocentric mammalian
chromosomes can be generated using site-specific recombination between
appropriate arms of plant chromosomes.

Approach

Qin et al. ((1994) Proc. Natl. Acad. Sci. U.S.A. 91:1706-1710)
describe crossing two Nicotiana tabacum transgenic plants. One
plant contains a construct encoding a promoterless hygromycin-resistance
gene preceded by a lox site (lox-hpt); the other plant carries a construct
containing a cauliflower mosaic virus 35S promoter linked to a lox
sequence and the cre DNA recombinase coding region (35S-lox-cre). The
constructs were introduced separately by infecting leaf explants with
Agrobacterium tumefaciens carrying the kanamycin-resistance gene
(KanR). The resultant KanR transgenic plants were crossed. Plants that
carried the appropriate DNA recombination event were identified by
hygromycin resistance.

Modification of the above for generation of ACes

The KanR cultivars are initially screened, such as by FISH, to
identify two sets of candidate transgenic plants. One set has one
construct integrated in regions adjacent to the pericentric heterochromatin
on the short arm of any chromosome. The second set of candidate plants
has the other construct integrated in the NOR region of appropriate
chromosomes. To obtain a reciprocal translocation, both sites must be in
the same orientation. Therefore, a series of crosses is required: KanR
plants are generated, and FISH analyses are performed to identify the
appropriate "acrocentric" plant chromosome for de novo plant ACes formation.

2. Bacteriophage lambda integrase-based site-specific
recombination system

An integral part of the platform technology includes a site-specific
recombination system that allows the placement of selected gene targets
or genomic fragments onto the platform chromosomes. Any such system
may be used. In particular, a method is provided for insertion of
additional DNA fragments into the platform chromosome residing in the
cell via sequence-specific recombination using the recombinase activity of
the bacteriophage lambda integrase. The lambda integrase system is
exemplary of the recombination systems contemplated for ACes. Any
known recombination system, including any described herein, particularly
any that operates without the need for additional factors or that, by virtue
of mutation, does not require additional factors, is contemplated.

As noted, the lambda integrase system provided herein can be used
with natural chromosomes and artificial chromosomes in addition to
ACes. A single recombination site or a plurality of recombination sites,
which may be the same or different, are introduced into artificial
chromosomes to produce artificial chromosome platforms.

3. Creation of bacteriophage lambda integrase site-specific 
recombination system 

The lambda phage-encoded integrase (designated Int) is a
prototypical member of the integrase family. Int effects integration and
excision of the phage into and out of the E. coli genome via recombination
between pairs of attachment sites designated attB/attP and attL/attR.
Each att site contains two inverted 9 base pair core Int binding sites and a
7 base pair overlap region that is identical in wild-type att sites. Each
site, except for attB, contains additional Int binding sites. In flanking
regions, there are recognition sequences for accessory DNA binding
proteins, such as integration host factor (IHF), factor for inversion
stimulation (FIS) and the phage-encoded excision protein (XIS). Int is a
heterobivalent DNA-binding protein and, except at attB, binds
simultaneously, with assistance from the accessory proteins and negative
DNA supercoiling, to core and arm sites within the same att site.

Int, like Cre and FLP, executes an ordered sequential pair of strand
exchanges during integrative and excisive recombination. The natural
pairs of target sequences for Int, attB and attP or attL and attR, are
located on the same or different DNA molecules, resulting in intra- or
intermolecular recombination, respectively. For example, intramolecular
recombination occurs between inversely oriented attB and attP, or
between attL and attR sequences, respectively, leading to inversion of the
intervening DNA segment.

Like other recombinase systems, such as Cre and FLP, Int directs
site-specific recombination. Unlike those systems,
Int generally requires additional protein factors for integrative and excisive
recombination and negative supercoiling for integrative recombination.
Hence, the Int system had not been used in eukaryotic targeting systems.
Mutant Int proteins, designated Int-h (E174K) and a derivative
thereof, Int-h/218 (E174K/E218K), do not require accessory proteins to
perform intramolecular integrative and excisive recombination in
co-transfection assays in human cells (Lorbach et al. (2000) J. Mol. Biol.
296:1175-1181); wild-type Int does not catalyze intramolecular
recombination in human cells harboring target sites attB and attP.
Hence it had been demonstrated that mutant Int can catalyze
factor-independent recombination events in human cells.

There has been no demonstration by others that this system can be
used for engineering of eukaryotic genomes or chromosomes. Provided
herein are chromosomes, including artificial chromosomes, such as but
not limited to ACes that contain att sites (e.g., platform ACes), and the
use of such chromosomes for targeted integration of heterologous DNA
into such chromosomes in eukaryotic cells, including animal, such as
rodent and human, and plant cells. Mutant Int provided herein is shown
to effect site-directed recombination between sites in artificial
chromosomes and vectors containing cognate sites.

An additional component of the chromosome-based platform
technology is the site-specific integration of target DNA sequences onto
the platform. For this, the native bacteriophage lambda integrase has
been modified to carry out this sequence-specific DNA recombination
event in eukaryotic cells. The bacteriophage lambda integrase and its
cognate DNA substrate att are members of the site-specific recombinase
family that also includes the bacteriophage P1 Cre/lox system as well as
the Saccharomyces cerevisiae 2 micron based FLP/FRT system (see, e.g.,
Landy (1989) Ann. Rev. Biochem. 58:913-949; Hoess et al. (1982) Proc.
Natl. Acad. Sci. U.S.A. 79:3398-3402; Broach et al. (1982) Cell
29:227-234).

By combining DNA endonuclease and DNA ligase activity, these
recombinases recognize and catalyze DNA exchanges between sequences
flanking the recognition site. During the integration of the lambda genome
into the E. coli genome (lambda recombination), the phage integrase (INT)
in association with accessory proteins catalyzes the DNA exchange
between the attP site of the phage genome and the attB site of the
bacterial genome, resulting in the formation of attL and attR sites (Figure
6). The engineered bacteriophage lambda integrase has been produced
herein to carry out an intermolecular DNA recombination event between
an incoming DNA molecule (primarily on a vector containing the bacterial
attB site) and the chromosome-based platform carrying the lambda attP
sequence, independent of lambda bacteriophage or bacterial accessory
proteins.
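
The exchange between attP and attB that yields attL and attR can be followed with a small symbolic model. The sketch below is not a description of the enzyme mechanism; it only tracks which half-sites end up joined. It uses the conventional arm notation (attP = P-O-P', attB = B-O-B', attL = B-O-P', attR = P-O-B'), and it assumes, for illustration, a linear platform with one attP, a circular vector with one attB, and both sites in the same orientation. All segment names are hypothetical.

```python
# Symbolic model of attB x attP integration; names are illustrative only.
ATT_P = ("P", "P'")
ATT_B = ("B", "B'")
SITE_NAMES = {
    ATT_P: "attP",
    ATT_B: "attB",
    ("B", "P'"): "attL",
    ("P", "B'"): "attR",
}


def integrate(platform, vector):
    """Insert a circular vector (one attB) into a linear platform (one attP).

    Both molecules are lists of segment labels; att sites are arm tuples.
    """
    p = platform.index(ATT_P)
    b = vector.index(ATT_B)
    # Open the circle at attB: segments after attB lead, those before it trail.
    payload = vector[b + 1:] + vector[:b]
    left_hybrid = (ATT_P[0], ATT_B[1])   # P arm joined to B' arm
    right_hybrid = (ATT_B[0], ATT_P[1])  # B arm joined to P' arm
    return platform[:p] + [left_hybrid] + payload + [right_hybrid] + platform[p + 1:]


def describe(molecule):
    """Replace arm tuples with their conventional att site names."""
    return [SITE_NAMES.get(part, part) for part in molecule]


if __name__ == "__main__":
    platform = ["promoter", ATT_P, "platform_backbone"]
    vector = [ATT_B, "marker2", "target_gene"]
    print(describe(integrate(platform, vector)))
```

With the example inputs, the whole vector ends up between two hybrid sites on the platform, and neither attP nor attB remains in the product.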

In contrast to the bi-directional Cre/lox and FLP/FRT systems, the
engineered lambda recombination system derived for chromosome-based
platform technology is advantageously unidirectional because accessory
proteins, which are absent, are required for excision of integrated nucleic
acid upon further exposure to the lambda Int recombinase.

4. Creation of platform chromosome containing single or 
multiple sequence-specific recombination sites 

a. Multiple sites

For the creation of a platform chromosome containing multiple,
sequence-specific recombination sites, artificial chromosomes are
produced as depicted in Figure 5 and Example 3. As discussed above,
artificial chromosomes can be produced using any suitable methodology,
including those described in U.S. Patent Nos. 5,288,625; 5,712,134;
5,891,691; 6,025,155. Briefly, to prepare artificial chromosomes
containing multiple recombination (e.g., integration) sites, nucleic acid
(in the form of one or more plasmids, such as the plasmid
pSV40193attPsensePUR set forth in Example 3) is targeted into an
amplifiable region of a chromosome, such as the pericentric region of a
chromosome. Among such regions are the rDNA gene loci in acrocentric
mammalian chromosomes. Hence, targeting nucleic acid for integration
into the rDNA region of mammalian acrocentric chromosomes can include
the mouse rDNA fragments (for targeting into rodent cell lines) or large
human rDNA regions on BAC/PAC vectors (or subclones thereof in
standard vectors) for targeting into human acrocentric chromosomes,
such as for human gene therapy applications. The targeting nucleic acid
generally includes a detectable or selectable marker, such as antibiotic
resistance (e.g., to puromycin or hygromycin), a recombination site
(such as attP, attB, attL, attR or the like), and/or human selectable
markers as required for gene therapy applications. Cells are grown under
conditions that result in amplification and ultimately production of ACes
artificial chromosomes having multiple recombination (e.g., integration)
sites therein. ACes having the desired size are selected for further
engineering.

b. Creation of platform chromosome containing a 
single sequence-specific recombination site 

In this method, a mammalian platform artificial chromosome is
generated containing a single sequence-specific recombination site. In
the Example below, this approach is demonstrated using a puromycin
resistance marker for selection and a mouse rDNA fragment for targeting
into the rDNA locus on mouse acrocentric chromosomes. Other selection
markers and targeting DNA sequences as desired and known to those of
skill in the art can be used. Additional resistance markers include genes
conferring resistance to the antibiotics neomycin, blasticidin, hygromycin
and zeocin. For applications, such as gene therapy, in which potentially
immunogenic responses are to be avoided, host-derived (such as
human-derived) selectable markers or markers detectable with monoclonal
antibodies (MAb) followed by fluorescence activated cell sorting (FACS)
can be used. Examples in this class include, but are not limited to: human
nerve growth factor receptor (detection with MAb); truncated human growth
factor receptor (detection with MAb); mutant human dihydrofolate reductase
(DHFR; detectable using a fluorescent methotrexate substrate); secreted
alkaline phosphatase (SEAP; detectable with fluorescent substrate);
thymidylate synthase (TS; confers resistance to fluorodeoxyuridine);
human CAD gene (confers resistance to N-phosphonacetyl-L-aspartate
(PALA)).

To construct a platform artificial chromosome with a single site, an
ACes artificial chromosome (or other artificial chromosome of interest)
can be produced containing a selectable marker. A single
sequence-specific recombination site is targeted onto ACes via homologous
recombination. For this, DNA sequences containing the site-specific
recombination sequence are flanked with DNA sequences homologous to
a selected sequence in the chromosome. For example, when using a
chromosome containing rDNA or satellite DNA, such DNA can be used as
homologous sequences to target the site-specific recombination sequence
onto the chromosome. A vector is designed to have these homologous
sequences flanking the site-specific recombination site and, after the
appropriate restriction enzyme digest to generate free ends of homology
to the chromosome, the DNA is transfected into cells harboring the
chromosome. After transfection and integration of the site-specific
cassette, cells are subcloned and homologous recombination events onto
the platform chromosome are identified, for example by screening
single cell subclones via expression of resistance or a fluorescent marker
and PCR analysis. In one embodiment, a platform artificial chromosome,
such as a platform ACes, that contains a single copy of the recombination
site is selected. Examples 2B and 2D exemplify the process, and Figure 3
provides a diagram depicting one method for the creation of a platform
mammalian chromosome containing a single sequence-specific
recombination site.

5. Lambda integrase mediated recombination of target gene
expression vector onto platform chromosome

The third component of the chromosome-based platform
technology involves the use of target gene expression vectors carrying,
for example, genes for gene therapy, genes for transgenic animal or plant
production, and those required for cellular protein production of interest.
Using lambda integrase mediated site-specific recombination, or any other
recombinase-mediated site-specific recombination, the target gene
expression vectors are introduced onto the selected chromosome
platform. The use of target gene expression vectors permits use of the de
novo generated chromosome-based platforms for a wide range of gene
targets. Furthermore, chromosome platforms containing multiple attP
sites provide the opportunity to incorporate multiple gene targets onto a
single platform, thereby providing for expression of multiple gene targets,
including the expression of cellular and genetic regulatory genes and the
expression of all or parts of metabolic pathways. In addition to
expressing small target genes, such as cDNA and hybrid cDNA/artificial
intron constructs, the chromosome-based platform can be used for
engineering and expressing large genomic fragments carrying target genes
along with their endogenous genomic promoter sequences. This is of
importance, for example, where the therapy requires precise cell-specific
expression and in instances where expression is best achieved from
genomic clones rather than cDNA clones. Figure 9 provides a diagram
summarizing one embodiment of the chromosome-based technology.

A feature of interest to include in the target gene expression vector
is a promoterless marker gene, which as exemplified (see Figure
9) contains an upstream attB site (marker 2 on Figure 9). The nucleic
acid encoding the marker is not expressed unless it is placed downstream
from a promoter sequence. Using the recombinase technology provided
herein, such as the lambda integrase technology (λINT E174R on Figure 8),
site-specific recombination between the attB site on the
vector and the promoter-attP site (in the "sense" orientation) on the
chromosome-based platform results in the expression of marker 2 on the
target gene expression vector, thereby providing a positive selection for
the lambda INT mediated site-specific recombination event. Site-specific
recombination events on the chromosome-based platform versus random
integrations next to a promoter in the genome (false positives) can be
quickly screened by designing primers to detect the correct event by PCR.
Examples of suitable marker 2 genes include, but are not limited to,
genes that confer resistance to toxic compounds or antibiotics,
fluorescence activated cell sorting (FACS) sortable cell surface markers
and various fluorescent markers. Examples of these genes include, but
are not limited to, human L26a R (human homolog of the Saccharomyces
cerevisiae CYH 8 gene), neomycin, puromycin, blasticidin, CD24 (see, e.g.,
US Patents 5,804,177 and 6,074,836), truncated CD4, truncated low
affinity nerve growth factor receptor (LNGFR), truncated LDL receptor,
truncated human growth hormone receptor, GFP, RFP, BFP.
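
The PCR screen mentioned above, which separates site-specific events from random integrations, can be pictured with a toy in-silico PCR. The sketch assumes one primer anneals in the platform promoter and the other in marker 2, so a product forms only when both lie on the same molecule in convergent orientation. The sequences, primer choices and function names below are invented placeholders, not sequences from this disclosure.

```python
# Toy in-silico PCR; all sequences and names are hypothetical placeholders.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward_primer, reverse_primer):
    """Length of the PCR product on the template's top strand, or None.

    The forward primer must match the top strand, and the reverse primer
    must match the bottom strand downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end < 0:
        return None
    return end + len(reverse_site) - start


PLATFORM_PROMOTER = "ACGTGACCTA"   # stands in for the platform promoter
MARKER2 = "TTGGCCAATC"             # stands in for the promoterless marker
JUNCTION = "CCCGGG"                # stands in for the hybrid att site

# Correct event: platform promoter, junction, then marker 2.
site_specific = "AAAA" + PLATFORM_PROMOTER + JUNCTION + MARKER2 + "TTTT"
# False positive: marker 2 beside some unrelated genomic sequence.
random_integration = "GATTACAGATTACA" + MARKER2 + "TTTT"

FORWARD = PLATFORM_PROMOTER
REVERSE = reverse_complement(MARKER2)

if __name__ == "__main__":
    print(amplicon_length(site_specific, FORWARD, REVERSE))
    print(amplicon_length(random_integration, FORWARD, REVERSE))
```

Only the site-specific template gives a product in this model; the random integration lacks the platform promoter sequence, so the forward primer has nowhere to anneal.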

The target gene expression vectors contain a gene (target gene) for
expression from the chromosome platform. The target gene can be
expressed using various constitutive or regulated promoter systems
across various mammalian species. For the expression of multiple target
genes within the same target gene expression vector, the expression of
the multiple targets can be coordinately regulated via viral-based or
human internal ribosome entry site (IRES) elements (see, e.g., Jackson et
al. (1990) Trends Biochem. Sci. 15:477-83; Oumard et al. (2000) Mol.
Cell. Biol. 20:2755-2759). Furthermore, using IRES-type elements linked
to a downstream fluorescent marker, e.g., green, red or blue fluorescent
proteins (GFP, RFP, BFP), allows for the identification of high-expressing
clones from the integrated target gene expression vector.

In certain embodiments described herein, the promoterless marker
can be transcriptionally downstream of the heterologous nucleic acid,
wherein the heterologous nucleic acid encodes a heterologous protein,
and wherein the expression level of the selectable marker is
transcriptionally linked to the expression level of the heterologous protein.
In addition, the selectable marker and the heterologous nucleic acid can
be transcriptionally linked by the presence of an IRES between them. As
set forth herein, the selectable marker is selected from the group
consisting of an antibiotic resistance gene and a detectable protein,
wherein the detectable protein is chromogenic or fluorescent.

Expression from the target gene expression vector integrated onto the
chromosome-based platform can be further enhanced using genomic
insulator/boundary elements. The incorporation of insulator sequences
into the target gene expression vector helps define boundaries in
chromatin structure and thus minimizes the influence of chromatin position
effects/gene silencing on the expression of the target gene (Bell et al.
(1999) Current Opinion in Genetics and Development 9:191-198; Emery
et al. (2000) Proc. Natl. Acad. Sci. U.S.A. 97:9150-9155). Examples of
insulator elements that can be included in the target gene expression
vector in order to optimize expression include, but are not limited to:

1) chicken β-globin HS4 element (Prioleau et al. (1999) EMBO J.
18:4035-4048);

2) matrix attachment regions (MAR; see, e.g., Ramakrishnan et
al. (2000) Mol. Cell. Biol. 20:868-877);

3) scaffold attachment regions (SAR; see, e.g., Auten et al.
(1999) Human Gene Therapy 10:1389-1399); and

4) universal chromatin opening elements (UCOE; WO/0005393
and WO/0224930).

The copy number of the target gene can be controlled by
sequentially adding multiple target gene expression vectors containing the
target gene onto multiple integration sites on the chromosome platform.
Likewise, the copy number of the target gene can be controlled within an
individual target gene expression vector by the addition of DNA
sequences that promote gene amplification. For example, gene
amplification can be induced utilizing the dihydrofolate reductase (DHFR)
minigene with subsequent selection with methotrexate (see, e.g.,
Schimke (1984) Cell 37:703-713) or amplification promoting sequences
from the rDNA locus (see, e.g., Wegner et al. (1989) Nucl. Acids Res.
17:9909-9932).

6. Platforms with other recombinase system sites

A "double lox" targeting strategy mediated by Cre recombinase
(Bethke et al. (1997) Nucl. Acids Res. 25:2828-2834) can be used. This
strategy employs a pair of heterospecific lox sites, loxA and loxB, which
differ by one nucleotide in the 8 bp spacer region. Both sites are
engineered into the artificial chromosome and also onto the targeting DNA
vector. This allows for a direct site-specific insertion of a commercially
relevant gene or genes by a Cre-catalyzed double crossover event. In
essence, a platform ACes is engineered with a hygromycin-resistance gene
flanked by the double lox sites, generating lox-ACes, which is maintained
in the thymidine kinase deficient cell line LMtk(-). The gene of interest
(for testing purposes, for example, the green fluorescent protein gene,
GFP) and an HSV thymidine kinase gene (tk) marker are engineered between
the appropriate lox sites of the targeting vector. The vector DNA is
cotransfected with plasmid pBS185 (Life Technologies), encoding the Cre
recombinase gene, into mammalian cells maintaining the dual-lox artificial
chromosome. Transient expression of the Cre recombinase catalyzes the
site-specific insertion of the gene and the tk gene onto the artificial
chromosome. The transfected cells are grown in HAT medium, which
selects for only those cells that have integrated and expressed the
thymidine kinase gene. The HAT-resistant colonies are screened by PCR
analyses to identify artificial chromosomes with the desired insertion.

To generate the lox-ACes, Lambda-HygR-lox DNA is transfected
into the LMtk(-) cell line harboring the precursor ACes.
Hygromycin-resistant colonies are analyzed by FISH and Southern blotting
for the presence of a single copy insert on the ACes.

To demonstrate the gene replacement technology, cell lines
containing candidate lox-ACes are cotransfected with pTK-GFP-lox and
pBS185 (encoding the Cre recombinase gene) DNA. After transfection,
transient expression of plasmid pBS185 will provide a sufficient burst of
Cre recombinase activity to catalyze DNA recombination at the lox sites.
Thus, a double crossover event between the ACes target and the
exogenous targeting plasmid carrying the loxA and loxB sites permits the
simple replacement of the hygromycin-resistance gene on the lox-ACes
with the tk-GFP cassette from the targeting plasmid, with no integration of
vector DNA. Transfected cells are grown in HAT medium to select for tk
expression. Correct targeting will result in the generation of HAT-resistant,
hygromycin-sensitive, and green fluorescent cells. The desired integration
event is verified by Southern and PCR analyses. Specific PCR primer sets
are used to amplify DNA sequences flanking the individual loxA and loxB
sites on the lox-ACes before and after homologous recombination.
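
The double crossover just described amounts to swapping whatever lies between loxA and loxB on the ACes for whatever lies between the same two sites on the targeting plasmid. A minimal symbolic sketch follows; molecules are modeled as lists of labels, the function and label names are hypothetical, and the phenotype table simply restates the expected outcome given in the text.

```python
# Symbolic sketch of a Cre-mediated double-lox cassette exchange.
def cassette_exchange(target, donor, left="loxA", right="loxB"):
    """Replace the segment between the two lox sites on the target
    with the segment between the same two sites on the donor."""
    def site_positions(molecule):
        i, j = molecule.index(left), molecule.index(right)
        if i > j:
            raise ValueError("sites must appear in the order left ... right")
        return i, j

    ti, tj = site_positions(target)
    di, dj = site_positions(donor)
    # Keep the target up to and including loxA, take the donor cassette,
    # then resume the target from loxB onward; no donor backbone is copied.
    return target[:ti + 1] + donor[di + 1:dj] + target[tj:]


def phenotype(molecule):
    """Expected selectable and screenable traits for a given cassette."""
    return {
        "HAT resistant": "tk" in molecule,
        "hygromycin resistant": "hyg" in molecule,
        "green fluorescent": "GFP" in molecule,
    }


lox_aces = ["ACes_left", "loxA", "hyg", "loxB", "ACes_right"]
targeting_plasmid = ["backbone", "loxA", "tk", "GFP", "loxB", "backbone"]

if __name__ == "__main__":
    exchanged = cassette_exchange(lox_aces, targeting_plasmid)
    print(exchanged)
    print(phenotype(exchanged))
```

In this model the exchanged chromosome carries tk and GFP but no hyg and no plasmid backbone, matching the HAT-resistant, hygromycin-sensitive, green fluorescent outcome expected for correct targeting.
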

D. Exemplary applications of the Platform ACes

Platform ACes are applicable and tractable for different/optimized
cell lines. Those that include a fluorescent marker, for example, can be
purified and isolated using fluorescence activated cell sorting (FACS), and
subsequently delivered to a target cell. Those with selectable markers
provide for efficient selection and provide a growth advantage. Platform
ACes allow multiple payload delivery of donor target vectors via a
positive-selection, site-specific recombination system, and they allow for
the inclusion of additional genetic factors that improve protein production
and protein quality.

The construction and use of the platform ACes as provided for
each application may be similarly applied to other applications. Particular
descriptions are for exemplification.

1. Cellular Protein Production Platform ACes (CPP ACes)

As described herein, ACes can be produced from acrocentric
chromosomes in rodent (mouse, hamster) cell lines via
megareplicator-induced amplification of heterochromatin/rDNA sequences.
Such ACes are ideal for cellular protein production as well as other
applications described herein and known to those of skill in the art. ACes
platforms that contain a plurality of recombination sites are particularly
suitable for engineering as cellular protein production systems.

In one embodiment, CPP ACes involve a two-component system:
the platform chromosome containing multiple engineering sites and the
donor target vector containing a platform-specific recombination site with
designed expression cassettes (see Figure 9).

The platform ACes can be produced from any artificial
chromosome, particularly the amplification-based artificial chromosomes.
For exemplification, they are produced from rodent artificial chromosomes
produced from acrocentric chromosomes using the technology of U.S.
Patent Nos. 6,077,697 and 6,025,155 and published International PCT
application No. WO 97/40183, in which nucleic acid is targeted to the
pericentric heterochromatin and, particularly, into rDNA to initiate the
replication event(s). The ACes can be produced directly in the chosen
cellular protein production cell lines, such as, but not limited to, CHO
cells, hybridomas, plant cells, plant tissues, plant protoplasts, stem cells
and plant calli.

a. Platform Construction

In the exemplary embodiment, the initial de novo platform
construction requires co-transfecting with excess targeting DNA, such as
rDNA or lambda DNA without an attP region, and an engineered
selectable marker. The engineered selectable marker should contain a
promoter, generally a constitutive promoter, such as a human or viral
(e.g., adenovirus or SV40) promoter, including the human ferritin heavy
chain promoter (SEQ ID NO:128), SV40 and EF1a promoters, to control
expression of a marker gene that provides a selective growth advantage
to the cell. An example of such a marker gene is the E. coli hisD gene
(encoding histidinol dehydrogenase), which is homologous and analogous
to the S. typhimurium hisD gene, a dominant marker selection system for
mammalian cells previously described (see Hartman et al. (1988) Proc.
Natl. Acad. Sci. U.S.A. 85:8047-8051). Since histidine is an essential
amino acid in mammals and a nutritional requirement in cell culture, the
E. coli hisD gene can be used to select for histidine prototrophy in defined
media. Furthermore, more stringent selection can be placed on the cells
by including histidinol in the medium. Histidinol is itself permeable and
toxic to cells. The hisD gene product provides a means of detoxification.

Placed between the promoter and the marker gene is the
bacteriophage lambda attP site, for use with the bacteriophage lambda
integrase dependent site-specific recombination system (described
herein). The insertion of an attP site downstream of a promoter element
provides forward selection of site-specific recombination events onto the
platform ACes.

b. Donor Target Vector Construction 

A second component of the CPP platform ACes system involves
the construction of donor target vectors containing a gene product(s) of
interest for the CPP platform ACes. Individual donor target vectors can
be designed for each gene product to be expressed, thus enabling
maximum usage of a de novo constructed platform ACes, so that only one or
a few CPP platform ACes will be required for many gene targets.
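
Loading several donor target vectors onto one platform can be pictured as a simple bookkeeping loop. The sketch below is a hedged illustration rather than a protocol: it assumes that each integration consumes one free attP site, that integrated vectors are never excised, and that each loaded vector needs a selectable marker not already used on that platform. The vector and marker names are hypothetical.

```python
# Bookkeeping sketch for sequential loading of donor target vectors.
def load_targets(free_attp_sites, donor_vectors):
    """Load donor vectors one at a time onto a platform.

    donor_vectors is a list of (vector_name, marker_name) pairs.
    Returns the names loaded and the number of attP sites still free.
    """
    used_markers = set()
    loaded = []
    for name, marker in donor_vectors:
        if free_attp_sites == 0:
            break  # no integration site left on the platform
        if marker in used_markers:
            continue  # this event could not be selected for; skip it
        used_markers.add(marker)
        loaded.append(name)
        free_attp_sites -= 1  # the consumed attP site is not regenerated
    return loaded, free_attp_sites


if __name__ == "__main__":
    donors = [
        ("heavy_chain_vector", "puromycin"),
        ("light_chain_vector", "hygromycin"),
        ("chaperone_vector", "puromycin"),
        ("secretion_factor_vector", "neomycin"),
    ]
    print(load_targets(3, donors))
```

In the example, the third vector is skipped because its marker has already been used, so the number of distinct markers, and not only the number of sites, limits how many targets are loaded.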

A key feature of the donor target vector is the promoterless marker
gene containing an upstream attB site (marker 2 on Figure 9). Normally
the marker would not be expressed unless it is placed downstream of a
promoter sequence. As discussed above, using the lambda integrase
technology (λINT E174R on Figure 8 and Figure 9), site-specific
recombination between the attB site on the vector and the promoter-attP
site on the CPP platform ACes results in the expression of the donor target
vector marker, providing positive selection for the site-specific event.
Site-specific recombination events on the CPP ACes versus random
integrations next to a promoter in the genome (false positives) can be
quickly screened by designing primers to detect the correct event by PCR.
In addition, since the lambda integrase reaction is unidirectional, i.e.,
the excision reaction is not possible, a number of unique targets can be
loaded onto the CPP platform ACes, limited only by the number of markers
available.

Additional features of the donor target vector include gene target
expression cassettes flanked by chromatin insulator regions, matrix
attachment regions (MAR) or scaffold attachment regions (SAR). The use
of these regions will provide a more "open" chromatin environment for
gene expression and help alleviate silencing. An example of such a
cassette for expressing a monoclonal antibody is described. For this
purpose, a strong constitutive promoter, e.g., chicken β-actin or RNA PolI,
is used to drive the expression of the heavy and light chain open reading
frames. The heavy and light chain sequences flank a nonattenuated
human IRES (IRES H; from the 5'UTR of the NRF1 gene; see Oumard et al.,
2000, Mol. and Cell. Biol., 20(8):2755-2759) element, thereby
coordinating transcription of both heavy and light chain sequences. Distal
to the light chain open reading frame resides an additional viral-encoded
IRES (IRES V; modified ECMV internal ribosomal entry site (IRES)) element
attenuating the expression of the fluorescent marker gene hrGFP from
Renilla (Stratagene). By linking the hrGFP with an attenuated IRES, the
heavy and light chains along with the hrGFP are monocistronic. Thus, the
identification of hrGFP fluorescing cells will provide a means to detect
protein-producing cells. In addition, high-producing cell lines can be
identified and isolated by FACS, thereby decreasing the time required to
find high expressers. Functional monoclonal antibody will be
confirmed by ELISA.

c. Additional components in cellular protein production
platform ACes (CPP ACes)

In addition to the aforementioned CPP ACes system, other genetic
factors can be included to enhance the yield and quality of the expressed
protein. Again, to provide maximum flexibility, these additional factors
can be inserted onto the CPP platform ACes by λINT E174R dependent
site-specific recombination. Other factors that could be used with a CPP
platform ACes include, for example, the adenovirus E1a transactivation
system, which upregulates both cellular and viral promoters (see, e.g.,
Svensson and Akusjarvi (1984) EMBO J. 3:789-794; and US patents
5,866,359; 4,775,630 and 4,920,211).

d. Targets for CHO-ACes engineering to enhance cell
growth, such as CHO cell growth and protein
production/quality

If adding these additional factors onto the CPP ACes is not prudent
or desired, the host cell, such as a CHO cell, can be engineered to express
these factors (see, below, targets for CHO-ACes engineering to enhance CHO
cell growth and protein production/quality). Additional factors to consider
include the addition of insulin or IGF-1 to sustain viability;
human sialyltransferases or related factors to produce more human-like
glycoproteins; expression of factors to decrease ammonium accumulation
during cell growth; expression of factors to inhibit apoptosis; expression
of factors to improve protein secretion and protein folding; and expression
of factors to permit serum-free transfection and selection.

1) Addition of insulin or IGF-1 to sustain
viability

Stimulatory factors and/or their receptors are expressed to set up
an autocrine loop, to improve cell growth, such as CHO cell growth. Two
exemplary candidates are insulin and IGF-1 (see, Biotechnol Prog 2000
Sep;16(5):693-7). Insulin is the most commonly used growth factor for
sustaining cell growth and viability in serum-free Chinese hamster ovary
(CHO) cell cultures. Insulin and an IGF-1 analog (LongR(3)) serve as growth
and viability factors for CHO cells.

CHO cells were modified to produce higher levels of essential
nutrients and factors. A serum-free (SF) medium for dihydrofolate
reductase-deficient Chinese hamster ovary cells (DG44 cells) was
prepared. Chinese hamster ovary cells (DG44 cells), which are normally
maintained in 10% serum medium, were gradually weaned to 0.5%
serum medium to increase the probability of successful growth in SF
medium (see, Kim et al. (1999) In Vitro Cell Dev Biol Anim 35(4):178-82).
A SF medium (SF-DG44) was formulated by supplementing the basal
medium with these components; basal medium was prepared by
supplementing Dulbecco's modified Eagle's medium and Ham's nutrient
mixture F12 with hypoxanthine (10 mg/l) and thymidine (10 mg/l).
Development of a SF medium for DG44 cells was facilitated using a
Plackett-Burman design technique and weaning of cells.
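
For readers unfamiliar with the Plackett-Burman technique mentioned
above, the following is a minimal sketch of how such a screening design
can be generated and analyzed. It uses the standard 12-run generator row
and is not the design actually used in the cited study; the factor
indices stand for unnamed, hypothetical medium components, and the
responses are simulated from a made-up model rather than measured.

```python
# Generator row of the standard 12-run Plackett-Burman design
# (+1 = component at its high level, -1 = component at its low level).
PB12_GENERATOR = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]


def plackett_burman_12():
    """Return the 12-run, 11-factor design as a list of rows of +1/-1."""
    n = len(PB12_GENERATOR)
    # Eleven cyclic shifts of the generator row ...
    rows = [PB12_GENERATOR[n - s:] + PB12_GENERATOR[:n - s] for s in range(n)]
    # ... plus a final run with every factor at its low level.
    rows.append([-1] * n)
    return rows


def main_effects(design, responses):
    """Main effect of each factor: mean response at +1 minus mean at -1."""
    half = len(design) / 2
    return [sum(row[j] * y for row, y in zip(design, responses)) / half
            for j in range(len(design[0]))]


design = plackett_burman_12()

# Simulated responses (hypothetical model, not data from the cited study):
# baseline 10; factor 0 shifts the response by +3/-3 at its high/low level,
# factor 4 by -2/+2; the remaining factors have no effect.
responses = [10 + 3 * row[0] - 2 * row[4] for row in design]
effects = main_effects(design, responses)

for j, effect in enumerate(effects):
    print(f"factor {j}: main effect {effect:+.1f}")
```

Because the columns of the design are mutually orthogonal, the main
effect of each of the 11 candidate components can be estimated
independently from only 12 cultures, which is what makes the approach
attractive for screening medium supplements.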


2) Human sialyltransferases or related 
factors to produce more human-like 
glycoproteins 

CHO cells have been modified by increasing their ability to process
protein via addition of complex carbohydrates. This has been achieved by
overexpression of relevant processing enzymes, or in some cases,
reducing expression of relevant enzymes (see, Bragonzi et al. (2000)
Biochim Biophys Acta 1474(3):273-282; see, also Weikert et al. (1999)
Nature Biotech. 17:1116-1121; Ferrari J et al. (1998) Biotechnol Bioeng
60(5):589-95). A CHO cell line expressing alpha2,6-sialyltransferase was
developed for the production of human-like sialylated recombinant
glycoproteins. The sialylation defect of CHO cells can be corrected by
transfecting the alpha2,6-sialyltransferase (alpha2,6-ST) cDNA into the
cells. Glycoproteins produced by such CHO cells display alpha2,6- and
alpha2,3-linked terminal sialic acid residues, similar to human
glycoproteins.

As another example for improving the production of human-like
sialylated recombinant glycoproteins, a CHO cell line has been developed
that constitutively expresses sialidase antisense RNA (see, Ferrari J et al.
(1998) Biotechnol Bioeng 60(5):589-95). Several antisense expression
vectors were prepared using different regions of the sialidase gene.
Co-transfection of the antisense constructs with a vector conferring
puromycin resistance gave rise to over 40 puromycin resistant clones that
were screened for sialidase activity. A 5' 474 bp coding segment of the
sialidase cDNA, in the inverted orientation in an SV40-based expression
vector, gave maximal reduction of the sialidase activity to about 40% of
wild-type values.

Oligosaccharide biosynthesis pathways in mammalian cells have
been engineered for generation of recombinant glycoproteins (see, e.g.,
Sburlati et al. (1998) Biotechnol Prog 14(2):189-92, which describes a
Chinese hamster ovary (CHO) cell line capable of producing bisected
oligosaccharides on glycoproteins). This cell line was created by
overexpression of a recombinant N-acetylglucosaminyltransferase III
(GnT-III) (see, also, Prati et al. (1998) Biotechnol Bioeng 59(4):445-50,
which describes antisense strategies for glycosylation engineering of CHO
cells).

3) Expression of factors to decrease 
ammonium accumulation during cell 
growth 

Excess ammonium, which is a by-product of CHO cell metabolism,
can have detrimental effects on cell growth and protein quality (see, Yang
et al. (2000) Biotechnol Bioeng 68(4):370-80). To solve this problem,
ammonium levels were modified by overexpressing carbamoyl phosphate
synthetase I and ornithine transcarbamoylase or glutamine synthetase in
CHO cells. Such modification resulted in reduced ammonium levels
and an increase in the growth rate (see Kim et al. (2000) J
Biotechnol 81(2-3):129-40; and Enosawa et al. (1997) Cell Transplant
6(5):537-40).

4) Expression of factors to improve protein
secretion and protein folding

Overexpression of relevant enzymes can be engineered into the
ACes to improve protein secretion and folding.

5) Expression of factors to permit serum-free 
transfection and selection 

It is advantageous to have the ability to convert CHO cells
growing in suspension in serum-free medium to adherence without having
to resort to serum addition. Laminin or fibronectin addition is sufficient to
make cells adherent (see, e.g., Zaworski et al. (1993) Biotechniques
15(5):363-6), so that expressing either of these genes in CHO cells under
an inducible promoter should allow for a reversible shift to adherence
without requiring serum addition.

2. Platform ACes and Gene Therapy 

The platform ACes provided herein are contemplated for use in
mammalian gene therapy, particularly human gene therapy. Human ACes
can be derived from human acrocentric chromosomes from human host
cells, in which the amplified sequences are heterochromatic and/or human
rDNA. Different platform ACes applicable for different tissue cell types
are provided. The ACes for gene therapy can contain a single copy of a
therapeutic gene inserted into a defined location on platform ACes.
Therapeutic genes include genomic clones, cDNA, hybrid genes and other
combinations of sequences. Preferred selectable markers are those from
the mammalian host, such as human derived factors, so that they are
non-immunogenic, non-toxic and allow for efficient selection, such as by
FACS and/or drug resistance.

Platform ACes, useful for gene therapy and other applications, as
noted herein, can be generated by megareplicator dependent
amplification, such as by the methods in U.S. Patent Nos. 6,077,697 and
6,025,155 and published International PCT application No.
WO 97/40183. In one embodiment, human ACes are produced using
human rDNA constructs that target rDNA arrays on human acrocentric
chromosomes and induce the megareplicator in human cells, particularly
in primary cell lines (with a sufficient number of doublings to form the
ACes) or stem cells (such as hematopoietic stem cells, mesenchymal
stem cells, adult stem cells or embryonic stem cells) to avoid the
introduction of potentially harmful rearranged DNA sequences present in
many transformed cell lines. Megareplicator induced ACes formation can
result in multiple copies of targeting DNA/selectable markers in each
amplification block on both chromosomal arms of the platform ACes.

In view of the considerations regarding immunogenicity and
toxicity, the production of human platform ACes for gene therapy
applications employs a two component system analogous to the platform
ACes designed for cellular protein production (CPP platform ACes). The
system includes a platform chromosome of entirely human DNA origin
containing multiple engineering sites and a gene target vector carrying the
therapeutic gene of interest.

a. Platform Construction 
The initial de novo construction of the platform chromosome
employs the co-transfection of excess targeting DNA and a selectable
marker. In one embodiment, the DNA is targeted to the rDNA arrays on
the human acrocentric chromosomes (chromosomes 13, 14, 15, 21 and
22). For example, two large human rDNA containing PAC clones 18714
and 18720 and the human PAC clone 558F8 are used for targeting
(Genome Research (ML) now Incyte, BACPAC Resources, 747 52nd
Street, Oakland CA). The mouse rDNA clone pFK161 (SEQ ID NO: 118),
which was used to make the human SATAC from the 94-3
hamster/human hybrid cell line (see, e.g., published International PCT
application No. WO 97/40183 and Csonka, et al. Journal of Cell Science
113:3207-3216; and Example 1 for a description of pFK161), can also be
used.

For animal applications, selectable markers should be
non-immunogenic in the animal, such as a human, and include, but are not
limited to: human nerve growth factor receptor (detected with a MAb,
such as described in US patent 6,365,373); truncated human growth
factor receptor (detected with MAb); mutant human dihydrofolate
reductase (DHFR; fluorescent MTX substrate available); secreted alkaline
phosphatase (SEAP; fluorescent substrate available); human thymidylate
synthase (TS; confers resistance to anti-cancer agent fluorodeoxyuridine);
human glutathione S-transferase alpha (GSTA1; conjugates glutathione to
the stem cell selective alkylator busulfan; chemoprotective selectable
marker in CD34+ cells); CD24 cell surface antigen in hematopoietic stem
cells; human CAD gene to confer resistance to
N-phosphonacetyl-L-aspartate (PALA); human multi-drug resistance-1
(MDR-1; P-glycoprotein surface protein selectable by increased drug
resistance or enriched by FACS); human CD25 (IL-2Rα; detectable by
MAb-FITC); methylguanine-DNA methyltransferase (MGMT; selectable by
carmustine); and cytidine deaminase (CD; selectable by Ara-C).

Since megareplicator induced amplification generates multiple
copies of the selectable marker, a second consideration for the selection
of the human marker is the resulting dose of the expressed marker after
ACes formation. A high level of expression of certain markers may be
detrimental to the cell and/or result in autoimmunity. One method to
decrease the dose of the marker protein is by shortening its half-life, such
as via the fusion of the well-conserved human ubiquitin tag (a 76 amino
acid sequence), thus leading to increased turnover of the selectable
marker. This has been used successfully for a number of reporter
systems including DHFR (see, e.g., Stack et al. (2000) Nature
Biotechnology 18:1298-1302 and references cited therein).
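
The relationship between marker copy number, half-life and steady-state
dose can be made concrete with a simple calculation. The sketch below
assumes, for illustration only, constant synthesis per gene copy and
first-order degradation, so that the steady-state level equals the total
synthesis rate divided by the degradation rate constant ln(2)/half-life.
The copy number, synthesis rate and half-lives are hypothetical values,
not measurements from this description.

```python
import math


def steady_state_level(copies, synthesis_per_copy, half_life_h):
    """Steady-state marker level for first-order decay: synthesis / k_deg."""
    k_deg = math.log(2) / half_life_h  # degradation rate constant (per hour)
    return copies * synthesis_per_copy / k_deg


# Hypothetical numbers chosen only to show the scaling.
untagged = steady_state_level(copies=20, synthesis_per_copy=1.0, half_life_h=24.0)
tagged = steady_state_level(copies=20, synthesis_per_copy=1.0, half_life_h=3.0)

print(f"fold reduction from shortening the half-life: {untagged / tagged:.1f}")
```

Under these assumptions the steady-state dose scales linearly with both
copy number and half-life, so a destabilizing tag can offset the
increase in dose caused by amplification of the marker.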

Using the ubiquitin tagged protein, a human selectable marker
system analogous to the CPP ACes described herein is constructed.
Briefly, a tagged selectable marker, such as, for example, one of those
described herein, is cloned downstream of an attP site and expressed
from a human promoter. Exemplary promoters contemplated for use
herein include, but are not limited to, the human ferritin heavy chain
promoter (SEQ ID NO: 128); RNA PolII; EF1α; TR;
glyceraldehyde-3-phosphate dehydrogenase core promoter (GAP); a GAP
core promoter including a proximal insulin inducible element the
intervening GAP sequence; phosphofructokinase promoter; and
phosphoglycerate kinase promoter. Also contemplated herein is an aldolase
A promoter H1 & H2 (representing closely spaced transcriptional start
sites) along with the proximal H enhancer. There are 4 promoters (e.g.,
transcriptional start sites) for this gene, each having different
regulatory and tissue activity. The H (most proximal 2) promoters are
ubiquitously expressed off the H enhancer. This resulting marker can then
be co-transfected along with excess human rDNA targeting sequence into
the host cells. An important criterion for the selection of the
recipient cells is a sufficient number of cell doublings for the formation
and detection of ACes. Accordingly, the co-transfections should be
attempted in human primary cells that can be cultured for long periods of
time, such as, for example, stem cells (e.g., hematopoietic, mesenchymal,
adult or embryonic stem cells), or the like. Additional cell types include,
but are not limited to: single gene transfected cells exhibiting increased
life-span; over-expressing c-myc cells, e.g., MSU1.1 (Morgan et al., 1991,
Exp. Cell Res., Nov;197(1):125-136); over-expressing telomerase lines,
such as TERT cells; SV40 large T-antigen transfected lines; tumor cell
lines, such as HT1080; and hybrid human cell lines, such as the 94-3
hamster/human hybrid cell line.

b. Gene Target Vector
The second component of the GT platform ACes (GT ACes) system
involves the use of engineered target vectors carrying the therapeutic
gene of interest. These are introduced onto the GT platform ACes via
site-specific recombination. As with the CPP ACes, the use of engineered
target vectors maximizes the use of the de novo generated GT platform
ACes for most gene targets. Furthermore, using lambda integrase
technology, GT platform ACes containing multiple attP sites permit the
incorporation of multiple therapeutic targets onto a single
platform. This could be of value in cases where a defined therapy
requires multiple gene targets, a single therapeutic target requires an
additional gene regulatory factor or a GT ACes requires a "kill" switch.

Similar to the CPP ACes, a feature of the gene target vector is the
promoterless marker gene containing an upstream attB site (marker 2 on
Figure 9). Normally, the marker (in this case, a cell surface antigen that
can be sorted by FACS would be ideal) would not be expressed unless it
is placed downstream of a promoter sequence. Using the lambda
integrase technology (λINT E174R on Figure 9), site-specific recombination
between the attB site on the vector and the promoter-attP site on the GT
platform ACes results in the expression of marker 2 on the gene target
vector, i.e., positive selection for the site-specific event. Site-specific
recombination events on the GT ACes versus random integrations next to
a promoter in the genome (false positives) can be quickly screened by
designing primers to detect the correct event by PCR.

For expression of the therapeutic gene, human specific promoters,
such as a ferritin heavy chain promoter (SEQ ID NO: 128); EF1α or RNA
PolII, are used. These promoters are for high level expression of a cDNA
encoded therapeutic protein. In addition to expressing cDNA (or even
hybrid cDNA/artificial intron constructs), the GT platform ACes are used
for engineering and expressing large genomic fragments carrying
therapeutic genes of interest expressed from native promoter sequences.
This is of importance in situations where the therapy requires precise cell
specific expression or in instances where expression is best achieved
from genomic clones versus cDNA.

3. Selectable markers for use, for example, in Gene
Therapy (GT)

The following are selectable markers that can be incorporated into
human ACes and used for selection.

Dual Resistance to 4-Hydroperoxycyclophosphamide
and Methotrexate by Retroviral Transfer of the Human
Aldehyde Dehydrogenase Class 1 Gene and a Mutated
Dihydrofolate Reductase Gene

The genetic transfer of drug resistance to hematopoietic cells is one
approach to overcoming myelosuppression caused by high-dose
chemotherapy. Because cyclophosphamide (CTX) and methotrexate
(MTX) are commonly used non-cross-resistant drugs, generation of dual
drug resistance in hematopoietic cells that allows dose intensification may
increase anti-tumor effects and circumvent the emergence of
drug-resistant tumors. A retroviral vector containing a human cytosolic
ALDH-1-encoding DNA clone and a human doubly mutated DHFR-encoding
clone (Phe22/Ser31; termed F/S in the description of constructs) was
constructed to generate increased resistance to CTX and MTX (Takebe
et al. (2001) Mol Ther 3(1):88-96). This construct may be useful for
protecting patients from high-dose CTX- and MTX-induced
myelosuppression. ACes can be similarly constructed.

Multiple mechanisms of N-phosphonacetyl-L-aspartate
resistance in human cell lines: carbamyl-P
synthetase/aspartate transcarbamylase/dihydro-orotase
gene amplification is frequent only when chromosome
2 is rearranged

Rodent cells resistant to N-phosphonacetyl-L-aspartate (PALA)
invariably contain amplified carbamyl-P synthetase/aspartate
transcarbamylase/dihydro-orotase (CAD) genes, usually in widely spaced
tandem arrays present as extensions of the same chromosome arm that
carries a single copy of CAD in normal cells (Smith et al. (1997) Proc.
Natl. Acad. Sci. U.S.A. 94:1816-21). In contrast, amplification of CAD is
very infrequent in several human tumor cell lines. Cell lines with minimal
chromosomal rearrangement and with unrearranged copies of
chromosome 2 rarely develop intrachromosomal amplifications of CAD.
These cells frequently become resistant to PALA through a mechanism
that increases the aspartate transcarbamylase activity with no increase in
CAD copy number, or they obtain one extra copy of CAD by forming an
isochromosome 2p or by retaining an extra copy of chromosome 2. In
cells with multiple chromosomal aberrations and rearranged copies of
chromosome 2, amplification of CAD as tandem arrays from rearranged
chromosomes is the most frequent mechanism of PALA resistance. All of
these different mechanisms of PALA resistance are blocked in normal
human fibroblasts. Thus, ACes with multiple copies of the CAD gene
would provide PALA resistance.

Retroviral coexpression of thymidylate synthase and
dihydrofolate reductase confers fluoropyrimidine and
antifolate resistance

Retroviral gene transfer of dominant selectable markers into
hematopoietic cells can be used to select genetically modified cells in vivo
or to attenuate the toxic effects of chemotherapeutic agents. Fantz et al.
((1998) Biochem Biophys Res Comm 243(1):6-12) have shown that
retroviral gene transfer of thymidylate synthase (TS) confers resistance to
TS directed anticancer agents and that co-expression of TS and
dihydrofolate reductase (DHFR) confers resistance to TS and DHFR
cytotoxic agents. Retroviral vectors encoding Escherichia coli TS, human
TS, and the Tyr-to-His at residue 33 variant of human TS (Y33HhTS)
were constructed, and fibroblasts transfected with these vectors showed
comparable resistance to the TS-directed agent fluorodeoxyuridine
(FdUrd, approximately 4-fold). Retroviral vectors that encode dual
expression of Y33HhTS and the human L22Y DHFR (L22YhDHFR)
variants conferred resistance to FdUrd (3- to 5-fold) and trimetrexate (30-
to 140-fold). A L22YhDHFR-Y33HhTS chimeric retroviral vector was also
constructed, and transduced cells were resistant to FdUrd (3-fold), AG337
(3-fold), trimetrexate (100-fold) and methotrexate (5-fold). These results
show that recombinant retroviruses can be used to transfer the cDNA
that encodes TS and DHFR, and that dual expression in transduced cells is
sufficiently high to confer resistance to TS and DHFR directed anticancer
agents. ACes can be similarly constructed.

Human CD34+ cells do not express glutathione
S-transferases alpha

The expression of glutathione S-transferases alpha (GST alpha) in
human hematopoietic CD34+ cells and bone marrow was studied using
RT-PCR and immunoblotting (Czerwinski M, Kiem et al. (1997) Gene Ther
4(3):263-70). The GSTA1 protein conjugates glutathione to the stem cell
selective alkylator busulfan. This reaction is the major pathway of
elimination of the compound from the human body. Human hematopoietic
CD34+ cells and bone marrow do not express GSTA1 message, which
was present at a high level in liver, an organ relatively resistant to
busulfan toxicity in comparison to bone marrow. Similarly, baboon
CD34+ cells and dog bone marrow do not express GSTA1. Thus, human
GSTA1 is a chemoprotective selectable marker in human stem cell gene
therapy and could be employed in ACes construction.

Selection of retrovirally transduced hematopoietic cells
using CD24 as a marker of gene transfer

Pawliuk et al. ((1994) Blood 84(9):2868-2877) have investigated
the use of a cell surface antigen as a dominant selectable marker to
facilitate the detection and selection of retrovirally infected target cells.
The small coding region of the human cell surface antigen CD24
(approximately 240 bp) was introduced into a myeloproliferative sarcoma
virus (MPSV)-based retroviral vector, which was then used to infect day 4
5-fluorouracil (5-FU)-treated murine bone marrow cells. Within 48 hours
of termination of the infection procedure, CD24-expressing cells were
selected by fluorescent-activated cell sorting (FACS) with an antibody
directed against the CD24 antigen. Functional analysis of these cells
showed that they included not only in vitro clonogenic progenitors and
day 12 colony-forming unit-spleen but also cells capable of competitive
long-term hematopoietic repopulation. Double-antibody labeling studies
performed on recipients of retrovirally transduced marrow cells showed
that some granulocytes, macrophages, erythrocytes, and, to a lesser
extent, B and T lymphocytes still expressed the transduced CD24 gene at
high levels 4 months later. No gross abnormalities in hematopoiesis were
detected in mice repopulated with CD24-expressing cells. These results
show that the use of the CD24 cell surface antigen as a retrovirally
encoded marker permits rapid, efficient, and nontoxic selection in vitro of
infected primary cells, facilitates tracking and phenotyping of their
progeny, and provides a tool to identify elements that regulate the
expression of transduced genes in the most primitive hematopoietic cells.
ACes could be similarly constructed.

DeltahGHR, a biosafe cell surface-labeling molecule for
analysis and selection of genetically transduced human
cells

A selectable marker for retroviral transduction and selection of
human and murine cells is known (see, Garcia-Ortiz et al. (2000) Hum
Gene Ther 11(2):333-46). The molecule expressed on the cell surface of
the transduced population is a truncated version of human growth
hormone receptor (deltahGHR), capable of ligand (hGH) binding, but
devoid of the domains involved in signal triggering. The engineered
molecule is stably expressed in the target cells as an inert protein unable
to trigger proliferation or to rescue the cells from apoptosis after ligand
binding. This new marker has a wide application spectrum, since hGHR
in the human adult is highly expressed only in liver cells, and lower levels
have been reported in certain lymphocyte cell populations. The
deltahGHR label has high biosafety potential, as it belongs to a
well-characterized hormonal system that is nonessential in adults, and
there is extensive clinical experience with hGH administration in humans.
The differential binding properties of several monoclonal antibodies (MAbs)
are used in a cell rescue method in which the antibody used to select
deltahGHR-transduced cells is eluted by competition with hGH or,
alternatively, biotinylated hGH is used to capture tagged cells. In the latter
system, the final purified population is recovered free of attached
antibodies in hGH (a substance approved for human use)-containing
medium. Such a system could be used to identify ACes-containing cells.

4. Transgenic models for evaluation of genes and
discovery of new traits in plants

Of interest is the use of plants and plant cells containing artificial
chromosomes for the evaluation of new genetic combinations and
discovery of new traits. Artificial chromosomes, by virtue of the fact that
they can contain significant amounts of DNA, can also encode
numerous genes and accordingly a multiplicity of traits. It is
contemplated here that artificial chromosomes, when formed from one
plant species, can be evaluated in a second plant species. The resultant
phenotypic changes observed, for example, can indicate the nature of the
genes contained within the DNA of the artificial
chromosome, and hence permit the identification of novel genetic
activities. Artificial chromosomes containing euchromatic DNA or partially
containing euchromatic DNA can serve as a valuable source of new traits
when transferred to an alien plant cell environment. For example, it is
contemplated that artificial chromosomes derived from dicot plant species
can be introduced into monocot plant species by transferring a dicot
artificial chromosome that possesses a region of euchromatic DNA
containing expressed genes.

The artificial chromosomes can be designed to allow the artificial
chromosome to recombine with the naturally occurring plant DNA in such
a fashion that a large region of naturally occurring plant DNA becomes
incorporated into the artificial chromosome. This allows the artificial
chromosome to contain new genetic activities and hence carry novel
traits. For example, an artificial chromosome can be introduced into a
wild relative of a crop plant under conditions whereby a portion of the
DNA present in the chromosomes of the wild relative is transferred to the
artificial chromosome. After isolation of the artificial chromosome, this
naturally occurring region of DNA from the wild relative, now located on
the artificial chromosome, can be introduced into the domesticated crop
species and the genes encoded within the transferred DNA expressed and
evaluated for utility. New traits and gene systems can be discovered in
this fashion. The artificial chromosome can be modified to contain
sequences that promote homologous recombination within plant cells, or
be modified to contain a genetic system that functions as a site-specific
recombination system.

Artificial chromosomes modified to recombine with plant DNA offer
many advantages for the discovery and evaluation of traits in different
plant species. When the artificial chromosome containing DNA from one
plant species is introduced into a new plant species, new traits and genes
can be introduced. This use of an artificial chromosome makes it possible
to overcome the sexual barrier that prevents transfer of genes from
one plant species to another species. Using artificial chromosomes in this
fashion allows for many potentially valuable traits to be identified,
including traits that are typically found in wild species. Other valuable
applications for artificial chromosomes include the ability to transfer large
regions of DNA from one plant species to another, such as DNA encoding
potentially valuable traits such as altered oil, carbohydrate or protein
composition, multiple genes encoding enzymes capable of producing
valuable plant secondary metabolites, genetic systems encoding valuable
agronomic traits such as disease and insect resistance, genes encoding
functions that allow association with soil bacteria such as growth
promoting bacteria or nitrogen fixing bacteria, or genes encoding traits
that confer freezing, drought or other stress tolerances. In this fashion,
artificial chromosomes can be used to discover regions of plant DNA that
encode valuable traits.

The artificial chromosome can also be designed to allow the
transfer and subsequent incorporation of these valuable traits now located
on the artificial chromosome into the natural chromosomes of a plant
species. In this fashion the artificial chromosomes can be used to
transfer large regions of DNA encoding traits normally found in one plant
species into another plant species. In this fashion, it is possible to derive
a plant cell that no longer needs to carry an artificial chromosome to
possess the novel trait. Thus, the artificial chromosome would serve as
the transfer mechanism to permit the formation of plants with a greater
degree of genetic diversity.

The design of an artificial chromosome to accomplish the
aforementioned purposes can include within the artificial chromosome the
presence of specific DNA sequences capable of acting as sites for
homologous recombination to take place. For example, the DNA
sequence of Arabidopsis is now known. To construct an artificial
chromosome capable of recombining with a specific region of Arabidopsis
DNA, a sequence of Arabidopsis DNA, normally located near a
chromosomal location encoding genes of potential interest, can be
introduced into an artificial chromosome by methods provided herein. It
may be desirable to include a second region of DNA within the artificial
chromosome that provides a second flanking sequence to the region
encoding genes of potential interest, to promote a double recombination
event which would ensure transfer of the entire chromosomal region,
encoding genes of potential interest, to the artificial chromosome. The
modified artificial chromosome, containing the DNA sequences capable of
homologous recombination, can then be introduced into
Arabidopsis cells and the homologous recombination event selected.

It is convenient to include a marker gene to allow for the selection
of a homologous recombination event. The marker gene is preferably
inactive unless activated by an appropriate homologous recombination
event. For example, US 5,272,071 describes a method where an
inactive plant gene is activated by a recombination event such that
desired homologous recombination events can be easily scored. Similarly,
US 5,501,967 describes a method for the selection of homologous
recombination events by activation of a silent selection gene first
introduced into the plant DNA, the gene being activated by an appropriate
homologous recombination event. Both of these methods can be applied
to select for recombination between an artificial chromosome and a plant
chromosome. Once the homologous recombination event is detected and
selected, the artificial chromosome is isolated and introduced into a
recipient cell, for example, tobacco, corn, wheat or rice, and the
expression of the newly introduced DNA sequences is evaluated.

Phenotypic changes in the recipient plant cells containing the
artificial chromosome, or in regenerated plants containing the artificial
chromosome, allow for the evaluation of the nature of the traits encoded
by the Arabidopsis DNA, under conditions naturally found in plant cells,
including the naturally occurring arrangement of DNA sequences
responsible for the developmental control of the traits in the normal
chromosomal environment.

Traits such as durable fungal or bacterial disease resistance, new
oil and carbohydrate compositions, valuable secondary metabolites such
as phytosterols and flavonoids, efficient nitrogen fixation or mineral
utilization, and resistance to extremes of drought, heat or cold are all
found within different populations of plant species and are often governed
by multiple genes. The use of single gene transformation technologies does
not permit the evaluation of the multiplicity of genes controlling many
valuable traits. Thus, incorporation of these genes into artificial
chromosomes allows the rapid evaluation of the utility of these genetic
combinations in heterologous plant species.

The large scale order and structure of the artificial chromosome
provides a number of unique advantages in screening for new utilities or
novel phenotypes within heterologous plant species. The size of new
DNA that can be carried by an artificial chromosome can be millions of
base pairs of DNA, representing potentially numerous genes that may
have novel utility in a heterologous plant cell. Because the artificial
chromosome is a "natural" environment for gene expression, the problems
of variable gene expression and silencing seen for genes transferred by
random insertion into a genome should not be observed. Similarly, there
is no need to engineer the genes for expression, and the genes inserted
would not need to be recombinant genes. Thus, one expects the expression
from the transferred genes to be temporal and spatial, as observed in the
species from where the genes were initially isolated. A valuable feature
for these utilities is the ability to isolate the artificial chromosomes and to
further isolate, manipulate and introduce into other cells artificial
chromosomes carrying unique genetic compositions.

Thus, artificial chromosomes and homologous recombination in
plant cells can be used to isolate and identify many valuable crop traits.

In addition to the use of artificial chromosomes for the isolation and
testing of large regions of naturally occurring DNA, methods for the use
of artificial chromosomes and cloned DNA are also contemplated. Similar
to that described above, artificial chromosomes can be used to carry large
regions of cloned DNA, including that derived from other plant species.

The ability to incorporate novel DNA elements into an artificial
chromosome as it is being formed allows for the development of artificial
chromosomes specifically engineered as a platform for testing of new
genetic combinations, or "genomic" discoveries for model species such as
Arabidopsis. It is known that specific "recombinase" systems can be
used in plant cells to excise or re-arrange genes. These same systems
can be used to derive new gene combinations contained on an artificial
chromosome.

The artificial chromosomes can be engineered as platforms to
accept large regions of cloned DNA, such as that contained in Bacterial
Artificial Chromosomes (BACs) or Yeast Artificial Chromosomes (YACs).
It is further contemplated that, as a result of the typical structure of
artificial chromosomes containing tandemly repeated DNA blocks,
sequences other than cloned DNA sequence can be introduced by
recombination processes. In particular, recombination within a predefined
region of the tandemly repeated DNA within the artificial chromosome
provides a mechanism to "stack" numerous regions of cloned DNA,
including large regions of DNA contained within BAC or YAC clones.
Thus, multiple combinations of genes can be introduced onto artificial
chromosomes and these combinations tested for functionality. In
particular, it is contemplated that multiple YACs or BACs can be stacked
onto an artificial chromosome, the BACs or YACs containing multiple
genes of complex pathways or multiple genetic pathways. The BACs or
YACs are typically selected based on genetic information available within
the public domain, for example from the Arabidopsis Information
Management System (http://aims.cps.msu.edu/aims/index.html) or the
information related to the plant DNA sequences available from the
Institute for Genomic Research (http://www.tigr.org) and other sites
known to those skilled in the art. Alternatively, clones can be chosen at
random and evaluated for functionality. It is contemplated that
combinations providing a desired phenotype can be identified by isolation
of the artificial chromosome containing the combination and analyzing the
nature of the inserted cloned DNA.



In this regard, it is contemplated that the use of site-specific
recombination sequences can have considerable utility in developing
artificial chromosomes containing DNA sequences recognized by
recombinase enzymes and capable of accepting DNA sequences
containing same. The use of site-specific recombination as a means to
target an introduced DNA to a specific locus has been demonstrated in
the art and such methods can be employed. The recombinase systems
can also be used to transfer the cloned DNA regions contained within the
artificial chromosome to the naturally occurring plant or mammalian
chromosomes.

As noted herein, many site-specific recombinases are known and
can be identified (Kilby et al. (1993) Trends in Genetics 9:413-418). The
three recombinase systems that have been extensively employed include:
an activity identified as R encoded by the pSR1 plasmid of
Zygosaccharomyces rouxii, FLP encoded by the 2 µm circular plasmid from
Saccharomyces cerevisiae, and Cre-lox from the phage P1.

The integration function of site-specific recombinases is
contemplated as a means to assist in the derivation of genetic
combinations on artificial chromosomes. In order to accomplish this, it is
contemplated that a first step of introducing site-specific recombinase
sites into the genome of a plant cell in an essentially random manner is
conducted, such that the plant cell has one or more site-specific
recombinase recognition sequences on one or more of the plant
chromosomes. An artificial chromosome is then introduced into the plant
cell, the artificial chromosome engineered to contain a recombinase
recognition site (e.g., integration site) capable of being recognized by a
site-specific recombinase. Optionally, a gene encoding a recombinase
enzyme is also included, preferably under the control of an inducible
promoter. Expression of the site-specific recombinase enzyme in the
plant cell, either by induction of an inducible recombinase gene or by
transient expression of a recombinase sequence, causes a site-specific
recombination event to take place, leading to the insertion of a region of
the plant chromosomal DNA (containing the recombinase recognition site)
into the recombinase recognition site of the artificial chromosome, and
forming an artificial chromosome containing plant chromosomal DNA.
The artificial chromosome can be isolated and introduced into a
heterologous host, preferably a plant host, and expression of the newly
introduced plant chromosomal DNA can be monitored and evaluated for
desirable phenotypic changes. Accordingly, carrying out this
recombination with a population of plant cells, wherein the chromosomally
located recombinase recognition site is randomly scattered throughout the
chromosomes of the plant, can lead to the formation of a population of
artificial chromosomes, each with a different region of plant chromosomal
DNA, and each potentially representing a novel genetic combination.

This method requires the precise site-specific insertion of
chromosomal DNA into the artificial chromosome. This precision has
been demonstrated in the art. For example, Fukushige and Sauer ((1992)
Proc. Natl. Acad. Sci. USA, 89:7905-7909) demonstrated that the Cre-lox
homologous recombination system could be successfully employed to
introduce DNA into a predefined locus in a chromosome of mammalian
cells. In this demonstration a promoter-less antibiotic resistance gene
modified to include a lox sequence at the 5' end of the coding region was
introduced into CHO cells. Cells were re-transformed by electroporation
with a plasmid that contained a promoter with a lox sequence and a
transiently expressed Cre recombinase gene. Under the conditions
employed, the expression of the Cre enzyme catalyzed the homologous
recombination between the lox site in the chromosomally located
promoter-less antibiotic resistance gene, and the lox site in the introduced
promoter sequence, leading to the formation of a functional antibiotic
resistance gene. The authors demonstrated efficient and correct targeting
of the introduced sequence: 54 of 56 lines analyzed corresponded to the
predicted single copy insertion of the DNA due to Cre catalyzed
site-specific homologous recombination between the lox sequences.
Accordingly, a lox sequence may be first added to a genome of a
plant species capable of being transformed and regenerated to a whole
plant to serve as a recombinase target DNA sequence for recombination
with an artificial chromosome. The lox sequence may be optimally
modified to further contain a selectable marker which is inactive but can
be activated by insertion of the lox recombinase recognition sequence into
the artificial chromosome.

A promoterless marker gene or selectable marker gene linked to the
recombinase recognition sequence, which is first inserted into the
chromosomes of a plant cell, can be used to engineer a platform
chromosome. A promoter is linked to a recombinase recognition site, in
an orientation that allows the promoter to control the expression of the
marker or selectable marker gene upon recombination within the artificial
chromosome. Upon a site-specific recombination event between a
recombinase recognition site in a plant chromosome and the recombinase
recognition site within the introduced artificial chromosome, a cell is
derived with a recombined artificial chromosome, the artificial
chromosome containing an active marker or selectable marker activity
that permits the identification and/or selection of the cell.
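
The marker-activation logic described above can be illustrated with a
small Python sketch. This is only a toy model under stated assumptions,
not a procedure from this application: the names Construct and recombine
are hypothetical, and a recombinase recognition site is treated as a simple
join point between whatever lies upstream of it on one molecule and
whatever lies downstream of it on the other. The last lines restate the
targeting fraction reported for the 54 of 56 lines cited above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Construct:
    """Toy model (hypothetical) of a DNA molecule with one recognition site."""
    name: str
    upstream_of_site: Optional[str]    # element 5' of the site, e.g. a promoter
    downstream_of_site: Optional[str]  # element 3' of the site, e.g. a marker gene


def recombine(promoter_side: Construct,
              marker_side: Construct,
              recombinase_present: bool) -> Optional[str]:
    """Return the marker that becomes expressed, or None if nothing is activated.

    A marker is considered active only when recombination places a promoter
    directly upstream of a previously promoterless marker gene.
    """
    if not recombinase_present:
        return None  # no recombinase, so the sites never recombine
    promoter = promoter_side.upstream_of_site
    marker = marker_side.downstream_of_site
    if promoter is not None and marker is not None:
        return marker
    return None


# Promoterless marker next to a site in a plant chromosome, and a promoter
# next to a site on the artificial chromosome (illustrative names only).
plant_locus = Construct("plant chromosome", None, "selectable_marker")
artificial = Construct("artificial chromosome", "promoter", None)

activated = recombine(artificial, plant_locus, recombinase_present=True)
silent = recombine(artificial, plant_locus, recombinase_present=False)

# Fraction of analyzed lines with the predicted single-copy insertion,
# using the 54 of 56 figure quoted in the text.
targeting_percent = round(100 * 54 / 56, 1)
```

In this model a cell is selectable only when both the recombinase and the
two complementary halves (promoter and promoterless marker) are present,
which mirrors why unrecombined cells are not recovered under selection.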

The artificial chromosomes can be transferred to other plant or
animal species and the functionality of the new combinations tested. The
ability to conduct such an inter-chromosomal transfer of sequences has
been demonstrated in the art. For example, the use of the Cre-lox
recombinase system to cause a chromosome recombination event
between two chromatids of different chromosomes has been shown.

Any number of recombination systems may be employed as 
described herein, such as, but not limited to, bacterially derived systems 
such as the att/int system of phage lambda, and the Gin/gix system. 



WO 02/097059 



PCT/US02/17452 




More than one recombination system may be employed, including,
for example, one recombinase system for the introduction of DNA into an
artificial chromosome, and a second recombinase system for the
subsequent transfer of the newly introduced DNA contained within an
artificial chromosome into the naturally occurring chromosome of a
second plant species. The choice of the specific recombination system
used will be dependent on the nature of the modification contemplated.

By having the ability to isolate an artificial chromosome, in
particular, artificial chromosomes containing plant chromosomal DNA
introduced via site-specific recombination, and re-introduce the
chromosome into other mammalian or plant cells, particularly plant cells,
these new combinations can be evaluated in different crop species
without the need to first isolate and modify the genes, or carry out
multiple transformations or gene transfers to achieve the same
isolation and testing of combinations of the genes in plants.
The use of a site-specific recombinase also allows the convenient
recovery of the plant chromosomal region into other recombinant DNA
vectors and systems, such as mammalian or insect systems, for
manipulation and study.

Also contemplated herein are ACes, cell lines and methods for use
in screening new chromosomal combinations, deletions and truncations
within a eukaryotic genome that take advantage of the site-specific
recombination systems incorporated onto platform ACes provided herein.
For example, provided herein is a cell line useful for making a library of
ACes, comprising a multiplicity of heterologous recombination sites
randomly integrated throughout the endogenous chromosomes. Also
provided herein is a method of making a library of ACes comprising
random portions of a genome, comprising introducing one or more ACes
into a cell line comprising a multiplicity of heterologous recombination
sites randomly integrated throughout the endogenous chromosomes,
under conditions that promote the site-specific chromosomal arm
exchange of the ACes into, and out of, a multiplicity of the heterologous
recombination sites within the cell's chromosomal DNA; and isolating said
multiplicity of ACes, thereby producing a library of ACes whereby multiple
ACes have different portions of the genome within. Also provided herein
is a library of cells useful for genomic screening, said library comprising a
multiplicity of cells, wherein each cell comprises an ACes having a
mutually exclusive portion of a chromosomal nucleic acid therein. The
library of cells can be from a different species and/or cell type than the
chromosomal nucleic acid within the ACes. Also provided is a method of
making one or more cell lines, comprising:

a) integrating into endogenous chromosomal DNA of a selected cell
species, a multiplicity of heterologous recombination sites;

b) introducing a multiplicity of ACes under conditions that promote
the site-specific chromosomal arm exchange of the ACes into, and out of,
a multiplicity of the heterologous recombination sites integrated within the
cell's endogenous chromosomal DNA;

c) isolating said multiplicity of ACes, thereby producing a library of
ACes whereby a multiplicity of ACes have mutually exclusive portions of
the endogenous chromosomal DNA therein;

d) introducing the isolated multiplicity of ACes of step c) into a
multiplicity of cells, thereby creating a library of cells;

e) selecting different cells having mutually exclusive ACes therein
and clonally expanding or differentiating said different cells into clonal cell
cultures, thereby creating one or more cell lines.
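
The bookkeeping behind steps a) through e) can be sketched as a small
simulation. The sketch below is purely illustrative and rests on
assumptions that are not part of this application: the genome is modeled
as a single linear coordinate range, each exchange event is assumed to
capture a fixed-length segment starting at one randomly chosen
recombination site, and the function names simulate_aces_library and
select_mutually_exclusive are hypothetical.

```python
import random
from typing import List, Tuple

Segment = Tuple[int, int]  # half-open interval [start, end) on a toy genome


def select_mutually_exclusive(segments: List[Segment]) -> List[Segment]:
    """Keep a set of captured segments that do not overlap one another."""
    chosen: List[Segment] = []
    last_end = -1
    for start, end in sorted(set(segments)):
        if start >= last_end:  # no overlap with the previously kept segment
            chosen.append((start, end))
            last_end = end
    return chosen


def simulate_aces_library(genome_length: int, n_sites: int, n_cells: int,
                          segment_length: int, seed: int = 0) -> List[Segment]:
    """Toy version of steps a) to e): random sites, exchange, isolation, selection."""
    rng = random.Random(seed)
    # a) recombination sites integrate at random positions in the genome
    sites = rng.sample(range(genome_length - segment_length), n_sites)
    # b) in each cell, an ACes exchanges at one of the integrated sites
    # c) isolating the ACes yields one captured segment per cell
    captured = []
    for _ in range(n_cells):
        site = rng.choice(sites)
        captured.append((site, site + segment_length))
    # d) and e) keep only ACes carrying mutually exclusive portions
    return select_mutually_exclusive(captured)


library = simulate_aces_library(genome_length=1_000_000, n_sites=50,
                                n_cells=200, segment_length=5_000, seed=1)
```

The number of distinct library members in this model can never exceed the
number of integrated sites, which is one reason a multiplicity of randomly
integrated sites is needed to sample many different portions of a genome.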

These ACes, cell lines and methods utilize the site-specific
recombination sites on platform ACes in a manner analogous to YAC
manipulation related to: the methods of generating terminal deletions in
normal and artificial chromosomes (e.g., ACes; as described in Vollrath
et al., 1988, PNAS, USA, 85:6027-6031; and Pavan et al., PNAS, USA,
87:1300-1304); the methods of generating interstitial deletions in normal
and artificial chromosomes (as described in Campbell et al., 1991, PNAS,
USA, 88:5744-5748); and the methods of detecting homologous
recombination between two ACes (as described in Cellini et al., 1991,
Nuc. Acid Res., 19(5):997-1000).

5. Use of platform ACes in Pharmacogenomic/toxicology
applications (development of "Reporter ACes")

In addition to the placement of genes onto ACes chromosomes for
therapeutic protein production or gene therapy, the platform can be
engineered via the IntR lambda integrase to carry reporter-linked
constructs (reporter genes) that monitor changes in cellular physiology as
measured by the particular reporter gene (or a series of different reporter
genes) readout. The reporter-linked constructs are designed to include a
gene that can be detected (by, for example, fluorescence, drug resistance,
immunohistochemistry, or transcript production, and the like) with
well-known regulatory sequences that would control the expression of the
detectable gene. Exemplary regulatory promoter sequences are
well-known in the art.

A) Reporter ACes for drug pathway screening 
The ACes can be engineered to carry reporter-linked constructs
that indicate a signal is being transduced through one or a number of
pathways. For example, transcriptionally regulated promoters from genes
at the end (or any other chosen point) of particular signal transduction
pathways could be engineered on the ACes to express the appropriate
readout (either by fluorescent protein production or drug resistance) when
the pathway is activated (or down-regulated as well). In one
embodiment, a number of reporters from different pathways can be placed
on an ACes chromosome. Cells (and/or whole animals) containing such a
Reporter ACes could be exposed to a variety of drugs or compounds and
monitored for the effects of the drugs or compounds upon the selected
pathway(s) by the reporter gene(s). Thus, drugs or compounds can be
classified or identified by particular pathways they excite or
down-regulate. Similarly, transcriptional profiles obtained from genomic
array experiments can be biologically validated using the reporter ACes
provided herein.
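
The classification of compounds by the pathways they excite or
down-regulate can be sketched as follows. This is a minimal illustration
only: the compound and pathway names, the fold-change values and the
two-fold threshold are all hypothetical assumptions chosen for the
example, and the function classify_compounds is not part of any method
described in this application.

```python
from typing import Dict, List


def classify_compounds(readouts: Dict[str, Dict[str, float]],
                       threshold: float = 2.0) -> Dict[str, Dict[str, List[str]]]:
    """Group reporter pathways per compound by fold change relative to untreated cells.

    readouts maps compound -> {pathway reporter -> fold change}. A pathway is
    called activated at >= threshold and down-regulated at <= 1/threshold.
    """
    profiles: Dict[str, Dict[str, List[str]]] = {}
    for compound, by_pathway in readouts.items():
        activated = sorted(p for p, fc in by_pathway.items() if fc >= threshold)
        repressed = sorted(p for p, fc in by_pathway.items() if fc <= 1.0 / threshold)
        profiles[compound] = {"activated": activated, "down_regulated": repressed}
    return profiles


# Hypothetical reporter readouts (fold change of reporter signal).
example_readouts = {
    "compound_A": {"pathway_1": 5.0, "pathway_2": 1.1, "pathway_3": 0.3},
    "compound_B": {"pathway_1": 0.9, "pathway_2": 3.2, "pathway_3": 1.0},
}

profiles = classify_compounds(example_readouts)
```

Compounds with the same activated and down-regulated sets would fall into
the same class in this simple scheme; a real screen would also need
replicate measurements and controls, which are omitted here.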

B) Reporter ACes for toxic compound testing 

Environmental or man-made genotoxicants can be tested in cell
lines carrying platform ACes with a number of reporter genes linked to
promoters that are transcriptionally regulated in response to DNA damage,
induced apoptosis or necrosis, and cell-cycle perturbations. Furthermore,
new drugs and/or compounds could be tested in a similar manner with the
genotoxicant ACes reporter for their cellular/genetic toxicity by such a
screen. Likewise, toxic compound testing could be carried out in whole
transgenic animals carrying the ACes chromosome that measures
genotoxicant exposure ("canary in a coal mine"). Thus, the same or
similar type ACes could be used for toxicity testing in either a cell-based
or whole animal setting. An example would include ACes that carry
reporter-linked genes controlled by various cytochrome P450 profiled
promoters and the like.

C) Reporter ACes for individualized pharmacogenomics/drug profiling

A common disease may arise via various mechanisms. In many
instances there are multiple treatments available for a given disease.
However, the success of a given treatment may depend upon the
mechanism by which the disease originated and/or by the genetic
background of the patient. In order to establish the most effective
treatment for a given patient one could utilize the ACes reporters provided
herein. ACes reporters can be used in patient cell samples to determine
an individualized drug regimen for the patient. In addition, potential
polymorphisms affecting the transcriptional regulation of an individual's
particular gene can be assessed by this approach.

D) Reporter ACes for classification of similar patient tumors 
As with other diseases as described in 5.C) above, cancer cells
arise via different mechanisms. Furthermore, as a cancerous cell
propagates it may undergo genomic alterations. An ACes reporter
transferred to cells of different patients having the same disease, i.e.,
similar cancers, could be used to categorize the particular cancer of each
patient, thereby facilitating the identification of the most effective
therapeutic regimen. Examples would include the validation of array
profiling of certain classes of breast cancers. Subsequently, appropriate
drug profiling could be carried out as described above.

E) Reporter ACes as a "differentiation" sensor 

The ACes reporter can be used as a "differentiation" sensor in stem
cells or other progenitor cells in order to enrich by selection (FACS-based
screening, drug selection and/or use of a suicide gene) for a particular
class of differentiated or undifferentiated cells. For example, in one
embodiment, this assay could also be used for compound screening for
small molecule modifiers of cell differentiation.

F) Whole animal studies with Reporter ACes 

Finally, with whole-body fluorescence imaging technology (Yang et
al. (2000) PNAS 97:12278) any of the above Reporter ACes methods
could be used in conjunction with whole-body imaging to monitor reporter
genes within whole animals without sacrificing the animal. This would
allow temporal and spatial analysis of expression patterns under a given
set of conditions. The conditions tested may include, for example, normal
differentiation of a stem cell, response to drug or compound treatment
whether targeted to the diseased tissue or presented systemically,
response to genotoxicants, and the like.

The following examples are included for illustrative purposes only
and are not intended to limit the scope of the invention.

EXAMPLE 1 

pFK161 

Cosmid pFK161 (SEQ ID NO: 118) was obtained from Dr. Gyula
Hadlaczky and contains a 9 kb NotI insert derived from a murine rDNA
repeat (see clone 161 described in PCT Application Publication No.
WO97/40183 by Hadlaczky et al. for a description of this cosmid). This
cosmid, referred to as clone 161, contains sequence corresponding to
nucleotides 10,232-15,000 in SEQ ID NO. 26. It was produced by
inserting fragments of the megachromosome (see, U.S. Patent No.
6,077,697 and International PCT application No. WO 97/40183; for
example, H1D3, which was deposited at the European Collection of
Animal Cell Culture (ECACC) under Accession No. 96040929, is a
mouse-hamster hybrid cell line carrying this megachromosome) into
plasmid pWE15 (Stratagene, La Jolla, California; SEQ ID No. 31) as
follows. Half of a 100 µl low melting point agarose block (mega-plug)
containing isolated SATACs was digested with NotI overnight at 37°C.
Plasmid pWE15 was similarly digested with NotI overnight. The
mega-plug was then melted and mixed with the digested plasmid, ligation
buffer and T4 DNA ligase. Ligation was conducted at 16°C overnight.
Bacterial DH5α cells were transformed with the ligation product and
transformed cells were plated onto LB/Amp plates. Fifteen to twenty
colonies were grown on each plate for a total of 189 colonies. Plasmid
DNA was isolated from colonies that survived growth on LB/Amp medium
and analyzed by Southern blot hybridization for the presence of DNA that






hybridized to a pUC19 probe. This screening methodology assured that
all clones, even clones lacking an insert but yet containing the pWE15
plasmid, would be detected.

Liquid cultures of all 189 transformants were used to generate
cosmid minipreps for analysis of restriction sites within the insert DNA.
Six of the original 189 cosmid clones contained an insert. These clones
were designated as follows: 28 (~9-kb insert), 30 (~9-kb insert), 60
(~4-kb insert), 113 (~9-kb insert), 157 (~9-kb insert) and 161 (~9-kb
insert). Restriction enzyme analysis indicated that three of the clones
(113, 157 and 161) contained the same insert.

For sequence analysis the insert of cosmid clone no. 161 was
subcloned as follows. To obtain the end fragments of the insert of clone
no. 161, the clone was digested with NotI and BamHI and ligated with
NotI/BamHI-digested pBluescript KS (Stratagene, La Jolla, California).
Two fragments of the insert of clone no. 161 were obtained: a 0.2-kb and
a 0.7-kb insert fragment. To subclone the internal fragment of the insert
of clone no. 161, the same digest was ligated with BamHI-digested
pUC19. Three fragments of the insert of clone no. 161 were obtained: a
0.6-kb, a 1.8-kb and a 4.8-kb insert fragment.

The insert corresponds to an internal section of the mouse
ribosomal RNA gene (rDNA) repeat unit between positions 7551-15670
as set forth in GENBANK accession no. X82564, which is provided as
SEQ ID NO. 18. The sequence data obtained for the insert of clone no.
161 is set forth in SEQ ID NOS. 19-25. Specifically, the individual
subclones corresponded to the following positions in GENBANK accession
no. X82564 (SEQ ID NO: 18) and in SEQ ID NOs. 19-25:






Subclone   Start in   End in    Site          SEQ ID No.
           X82564     X82564

161k1      7579       7755      NotI, BamHI   19

161m5      7756       8494      BamHI         20

161m7      8495       10231     BamHI         21 (shows only sequence corresponding
                                              to nt. 8495-8950),
                                              22 (shows only sequence corresponding
                                              to nt. 9851-10231)

161m12     10232      15000     BamHI         23 (shows only sequence corresponding
                                              to nt. 10232-10600),
                                              24 (shows only sequence corresponding
                                              to nt. 14267-15000)

161k2      15001      15676     NotI, BamHI   25


The sequence set forth in SEQ ID NOs. 19-25 diverges in some
positions from the sequence presented in positions 7551-15670 of
GENBANK accession no. X82564. Such divergence may be attributable
to random mutations between repeat units of rDNA.
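
As a consistency check on the subclone coordinates listed above, the
following short Python sketch computes each subclone's length and
confirms that the five subclones tile the X82564 coordinates without gaps
or overlaps. The coordinates are taken from the table; the variable and
function names are illustrative only.

```python
# (subclone, start, end) in GENBANK accession no. X82564, inclusive coordinates
subclones = [
    ("161k1", 7579, 7755),
    ("161m5", 7756, 8494),
    ("161m7", 8495, 10231),
    ("161m12", 10232, 15000),
    ("161k2", 15001, 15676),
]


def length_bp(start: int, end: int) -> int:
    """Length of an inclusive coordinate range in base pairs."""
    return end - start + 1


def is_contiguous(entries) -> bool:
    """True if each subclone starts one base after the previous one ends."""
    return all(nxt[1] == prev[2] + 1 for prev, nxt in zip(entries, entries[1:]))


lengths = {name: length_bp(start, end) for name, start, end in subclones}
total_bp = sum(lengths.values())
contiguous = is_contiguous(subclones)
```

The computed lengths (177, 739, 1737, 4769 and 676 bp) are in line with
the approximate fragment sizes reported for the end fragments (0.2 kb
and 0.7 kb) and the internal fragments (0.6 kb, 1.8 kb and 4.8 kb), and
the subclones together span 8098 bp.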

For use herein, the rDNA insert from the clone was prepared by
digesting the cosmid with NotI and BglII and was purified as described
above. Growth and maintenance of bacterial stocks and purification of
plasmids were performed using standard well known methods (see, e.g.,
Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd
Edition, Cold Spring Harbor Laboratory Press), and plasmids were purified
from bacterial cultures using Midi- and Maxi-prep Kits (Qiagen,
Mississauga, Ontario).

pDsRed1N1 

This vector is available from Clontech (see SEQ ID No. 29) and
encodes the red fluorescent protein (DsRed; Genbank accession no.
AF272711; SEQ ID Nos. 39 and 40). DsRed, which has a vivid red
fluorescence, was isolated from the Indo-Pacific sea anemone relative
Discosoma species. The plasmid pDsRed1N1 (Clontech; SEQ ID No. 29)
constitutively expresses a human codon-optimized variant of the
fluorescent protein under control of the CMV promoter. Unmodified, this
vector expresses high levels of DsRed1 and includes sites for creating
N-terminal fusions by cloning proteins of interest into the multiple cloning
site (MCS). It confers kanamycin (Kan) and neomycin (Neo) resistance for
selection in bacterial or eukaryotic cells.

Plasmid pMG 

Plasmid pMG (InvivoGen, San Diego, California; see SEQ. ID. NO.
27 for the nucleotide sequence of pMG) contains the hygromycin
phosphotransferase gene under the control of the immediate-early human
cytomegalovirus (hCMV) enhancer/promoter with intron A. Vector pMG
also contains two transcriptional units allowing for the coexpression of
two heterologous genes from a single vector sequence.

The first transcriptional unit of pMG contains a multiple cloning site
for insertion of a gene of interest, the hygromycin phosphotransferase
gene (hph) and the immediate-early human cytomegalovirus (hCMV)
enhancer/promoter with intron A (see, e.g., Chapman et al. (1991) Nuc.
Acids Res. 19:3979-3986) located upstream of hph and the multiple
cloning site, which drives the expression of hph and any gene of interest
inserted into the multiple cloning site as a polycistronic mRNA. The first
transcriptional unit also contains a modified EMCV internal ribosomal
entry site (IRES) upstream of the hph gene but downstream of the hCMV
promoter and MCS for ribosomal entry in translation of the hph gene (see
SEQ ID NO. 27, nucleotides 2736-3308). The IRES is modified by
insertion of the constitutive E. coli promoter (EM7) within an intron (IM7)
into the end of the IRES. In mammalian cells, the E. coli promoter is
treated as an intron and is spliced out of the transcript. A
polyadenylation signal from the bovine growth hormone (bGh) gene (see,
e.g., Goodwin and Rottman (1992) J. Biol. Chem. 267:16330-16334)
and a pause site derived from the 3' flanking region of the human α2
globin gene (see, e.g., Enriquez-Harris et al. (1991) EMBO J.
10:1833-1842) are located at the end of the first transcription unit.
Efficient polyadenylation is facilitated by inserting the flanking sequence
of the bGh gene 3' to the standard AAUAAA hexanucleotide sequence.

The second transcriptional unit of pMG contains another multiple
cloning site for insertion of a gene of interest and an EF-1α/HTLV hybrid
promoter located upstream of this multiple cloning site, which drives the
expression of any gene of interest inserted into the multiple cloning site.
The hybrid promoter is a modified human elongation factor-1 alpha (EF-1
alpha) gene promoter (see, e.g., Kim et al. (1990) Gene 91:217-223)
that includes the R segment and part of the U5 sequence (R-U5') of the
human T-cell leukemia virus (HTLV) type I long terminal repeat (see, e.g.,
Takebe et al. (1988) Mol. Cell. Biol. 8:466-472). The Simian Virus 40
(SV40) late polyadenylation signal (see Carswell and Alwine (1989) Mol.
Cell. Biol. 9:4248-4258) is located downstream of the multiple cloning
site. Vector pMG contains a synthetic polyadenylation site for the first
and second transcriptional units at the end of the transcriptional unit
based on the rabbit β-globin gene and containing the AATAAA
hexanucleotide sequence and a GT/T-rich sequence with 22-23
nucleotides between them (see, e.g., Levitt et al. (1989) Genes Dev.
3:1019-1025). A pause site derived from the C2 complement gene (see,
Moreira et al. (1995) EMBO J. 14:3809-3819) is also located at the 3'
end of the second transcriptional unit.

Vector pMG also contains an ori sequence (ori pMB1) located
between the SV40 polyadenylation signal and the synthetic
polyadenylation site.

EXAMPLE 2 

A. Construction of targeting vector and transfection into LMtk- cells
for the generation of platform chromosomes

A targeting vector derived from the vector pWE15 (GenBank
Accession # X65279) was modified by replacing the SalI (Klenow
filled)/SmaI neomycin resistance containing fragment with the
PvuII/BamHI (Klenow filled) puromycin resistance containing fragment
(isolated from plasmid pPUR, Clontech Laboratories, Inc. Palo Alto, CA;
SEQ ID No. 30) resulting in plasmid pWEPuro. Subsequently a 9 Kb NotI
fragment from the plasmid pFK161 (SEQ ID NO: 118) containing a portion
of the mouse rDNA region was cloned into the NotI site of pWEPuro
resulting in plasmid pWEPuro9K (Figure 2). The vector pWEPuro9K was
linearized by digestion with SpeI and transfected into LMtk- mouse cells.
Puromycin resistant colonies were isolated and subsequently tested for
artificial chromosome formation via fluorescent in situ hybridization (FISH)
(using mouse major and minor DNA repeat sequences, the puromycin
gene and telomere sequences as probes), and fluorescence-activated cell
sorting (FACS). From this sort, a subclone was isolated containing an
artificial chromosome, designated 5B11.12, which carries 4-8 copies of
the puromycin resistance gene contained on the pWEPuro9K vector.
FISH analysis of the 5B11.12 subclone demonstrated the presence of
telomeres and mouse minor repeat sequences on the ACes. DOT PCR has
been done on the 5B11.12 ACes revealing the absence of uncharacterized
euchromatic regions on the ACes. A recombination site, such as an att or
loxP engineering site or a plurality thereof, was introduced onto this ACes
thereby providing a platform for site-specific introduction of heterologous
nucleic acid.

B. Targeting a single sequence specific recombination site onto
platform chromosomes

After the generation of the 5B11.12 platform, a single sequence-specific
recombination site is placed onto the platform chromosome via
homologous recombination. For this, DNA sequences containing the
site-specific recombination sequence can be flanked with DNA sequences of
homology to the platform chromosome. For example, using the platform
chromosome made from the pWEPuro9K vector, mouse rDNA sequences
or mouse major satellite DNA can be used as homologous sequences to
target onto the platform chromosome. A vector is designed to have these
homologous sequences flanking the site-specific recombination site and,
after the appropriate restriction enzyme digest to generate free ends of
homology to the platform chromosome, the DNA is transfected into cells
harboring the platform chromosome (Figure 3). Examples of site-specific
cassettes that are targeted to the platform chromosome using either
mouse rDNA or mouse major repeat DNA include the SV40-attP-hygro
cassette and a red fluorescent protein (RFP) gene flanked by loxP sites
(Cre/lox, see, e.g., U.S. Patent No. 4,959,317 and description herein).
After transfection and integration of the site-specific cassette,
cells carrying homologous recombination events on the platform
chromosome are subcloned and identified by FACS (e.g., screening and
single cell subcloning via expression of a resistance or fluorescent
marker) and PCR analysis.

For example, a vector can be constructed containing regions of the
mouse rDNA locus flanking a gene cassette containing the SV40 early
promoter-bacteriophage lambda attP site-hygromycin selectable marker
(see Figure 4 and described below). The use of the bacteriophage lambda
attP site for lambda integrase-mediated site-specific recombination is
described below. A homologous recombination event of the SV40-attP-hygro
cassette onto the platform chromosome was identified using PCR
primers that detect the homologous recombination and further confirmed
by FISH analysis. After identifying subcloned colonies containing the
platform chromosome with a single site-specific recombination site, cells
carrying the platform chromosome with a single site-specific
recombination site can now be engineered with site-specific recombinases
(e.g. lambda INT, Cre) for integrating a target gene expression vector.

C. Targeting a red fluorescent protein (RFP) gene flanked by loxP sites
onto 5B11.12 platform

As another example, while loxP recombination sites could have
been introduced onto the ACes during de novo biosynthesis, it was
thought that this might result in multiple segments of the ACes containing
a high number of loxP sites, potentially leading to instability upon
Cre-mediated recombination. A gene targeting approach was therefore
devised to introduce a more limited number of loxP recombination sites
into a locus of the 5B11-12 ACes containing introduced and possibly
co-amplified endogenous rDNA sequences. Although there are more than
200 copies of rDNA genes in the haploid mouse genome distributed
amongst 5-11 chromosomes (depending on strain), rDNA sequences were
chosen as the target on the ACes since they represent a less frequent
target than that of the satellite repeat sequences. Moreover, having
observed much stronger pWEPuro9K hybridization to the 5B11-12 ACes
than to other LMTK- chromosomes and in light of the observation that the
transcribed spacer sequences within the rDNA may be less conserved
than the rRNA coding regions, it was contemplated that a targeting vector
based on the rDNA gene segment in pWEPuro9K would have a higher
probability of targeting to the ACes rather than to other LMTK-
chromosomes. Accordingly, a targeting vector, pBSFKLoxDsRedLox, was
designed and constructed based on the rDNA sequences contained in
pWEPuro9K.

The plasmid pBSFKLoxDsRedLox was generated in 4 steps. First,
the NotI rDNA insert of pWEPuro9K (Figure 2) was inserted into pBS SK-
(Stratagene) giving rise to pBSFK. Second, a loxP polylinker cassette was
generated by PCR amplification of pNEB193 (SEQ ID NO:32; New
England Biolabs) using primers complementary to the M13 forward and
reverse priming sites at their 3' end and a 34 bp 5' extension comprising a
loxP site. This cassette was reinserted into pNEB193 generating
p193LoxMCSLox. Third, the DsRed gene from pDsRed1-N1 (SEQ ID
NO:29; Clontech) was then cloned into the polylinker between the loxP
sites generating p193LoxDsRedLox. Fourth, a fragment consisting of the
DsRed gene flanked by loxP sites was cloned into a unique NdeI site within
the rDNA insert of pBSFK generating pBSFKLoxDsRedLox.
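
The primer design used in the second step can be sketched in Python. This
is a minimal illustration, not the primers actually used. The 34 bp loxP
sequence below is the canonical site; the M13 forward (-20) and reverse
priming sequences are the commonly used 17-mers and are assumed here,
since the text does not list the primer sequences. Placing the reverse
complement of loxP on the second primer, so that the two sites end up as
direct repeats in the amplified cassette, is likewise an assumption.

```python
# Sketch: PCR primers with a 34 bp loxP site as 5' extension.
# ASSUMPTIONS: M13 priming sequences and site orientation are illustrative;
# the source text does not give the actual primer sequences.

LOXP_LEFT_ARM = "ATAACTTCGTATA"   # 13 bp inverted repeat
LOXP_SPACER = "GCATACAT"          # 8 bp asymmetric spacer
LOXP_RIGHT_ARM = "TATACGAAGTTAT"  # 13 bp inverted repeat
LOXP = LOXP_LEFT_ARM + LOXP_SPACER + LOXP_RIGHT_ARM

M13_FORWARD = "GTAAAACGACGGCCAGT"  # assumed M13 forward (-20) primer
M13_REVERSE = "CAGGAAACAGCTATGAC"  # assumed M13 reverse primer

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def make_loxp_primer(priming_region: str, flip_site: bool = False) -> str:
    """Put loxP (or its reverse complement) 5' of the priming region."""
    site = reverse_complement(LOXP) if flip_site else LOXP
    return site + priming_region


# The forward primer carries loxP as written; the reverse primer carries
# the reverse complement so both sites read the same way on the top strand.
forward_primer = make_loxp_primer(M13_FORWARD)
reverse_primer = make_loxp_primer(M13_REVERSE, flip_site=True)

print(len(LOXP), len(forward_primer), len(reverse_primer))
print(forward_primer)
print(reverse_primer)
```

Because the two 13 bp arms are inverted repeats of each other, only the
8 bp spacer gives the site a direction, which is why the orientation of
the extension on each primer matters.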

A gel purified 11 Kb PmlI/EcoRV fragment of pBSFKLoxDsRedLox
was used for transfection. To detect targeted integration, PCR primers
were designed from rDNA sequences within the 5' NotI-PmlI fragment of
pWEPuro9K that is not present on the targeting fragment (5' primer) and
sequence within the LoxDsRedLox cassette (3' primer). If the targeting
DNA integrated correctly within the rDNA sequences, PCR amplification
using these primers would give rise to a 2.3 Kb band. PCR reactions
containing 1-4 µl of genomic DNA were carried out according to the
MasterTaq protocol (Eppendorf), using murine rDNA 5' primer
(5'-CGGACAATGCGGTTGTGCGT-3'; SEQ ID NO:72) and DsRed 3' primer
(5'-GGCCCCGTAATGCAGAAGAA-3'; SEQ ID NO:73) and PCR products
were analyzed by agarose gel electrophoresis.

1.5 x 10^6 5B11-12 LMTK- cells were transfected with 2 µg of the
pBSFKLoxDsRedLox targeting DNA described above using Lipofectamine
Plus (Invitrogen). For flow sorting, harvested cells were suspended in
medium and applied to the Becton Dickinson Vantage SE cell sorter,
equipped with 488 nm lasers for excitation and 585/42 bandpass filter for
optimum detection of RFP fluorescence. Cells were sorted using dPBS as
sheath buffer. Negative control parental 5B11-12 cells and a positive
control LMTK- cell line stably transfected with DsRed were used to
establish the selection gates. The RFP positive gated populations were
recovered, diluted in medium supplemented with 1X penicillin-streptomycin
(Invitrogen), then plated and cultured as previously
described. After 4 rounds of enrichment, the percentage of RFP positive
cells reached levels of 50% or higher. DNA from populations was
analyzed by PCR for evidence of targeted integration. Ultimately, single
cell subclones were established from positive pools and were analyzed by
PCR and PCR-positive clones confirmed by FISH as described below.
DNA was purified from pools or single cell clones using previously
described methods set forth in Lahm et al., Transgenic Res., 1998;
7:131-134, or in some cases using a Wizard Genomic DNA purification kit
(Promega). For FISH analysis, a biotinylated DsRed gene probe was
generated by PCR using DsRed specific primers and biotin-labeled dUTP
(5' RFP primer: 5'-GGTTTAAAGTGCGCTCCTCCAAGAACGTCATC-3',
SEQ ID NO:74; and 3' RFP primer:
5'-AGATCTAGAGCCGCCGCTACAGGAACAGGTGGTGGCGGCC-3'; SEQ
ID NO:75). To maximize the signal intensity of the DsRed probe,
Tyramide amplification was carried out according to the manufacturer's
protocols (NEN).

The process of testing the feasibility of a more general targeting
strategy that would not rely on enrichment via drug selection of stably
transfected clones can be summarized as follows. A red fluorescent
protein gene (RFP; encoded by the DsRed gene) was inserted between the
loxP sites of the targeting vector to form pBSFKLoxDsRedLox. After
transfection with pBSFKLoxDsRedLox, sequential rounds of high speed
flow sorting and expansion of sorted cells in culture could then be used to
enrich for stable transformants expressing RFP. In the event of targeted
integration, PCR screening with primers that amplify from a spacer region
within the segment of the 45S pre-rRNA gene in pWEPuro9K to a specific
anchor sequence within the DsRed gene in the targeting cassette would
give rise to a diagnostic 2.3 Kb band. However, as rDNA clusters are
found on several chromosomes, confirmation of targeting to an ACes
would require fluorescence in situ hybridization (FISH) analysis. Finally,
the flanking of the DsRed gene by loxP sites would allow for its removal
and subsequent replacement with other genes of interest.

After transfection of the targeting sequence into 5B11-12 cells,
enrichment for targeted clones was carried out using a combination of
flow cytometry to detect red-fluorescing cells and PCR screening.
Ultimately 17 single cell subclones were identified as potential targeted
clones by PCR and of these 16 were found by FISH to contain the DsRed
integration event into the ACes. These subclones are referred to herein
as D11-C4, D11-C12, D11-H3, C9-C9, C9-B9, C9-F4, C9-H8, C9-F2,
C9-G8, C9-B6, C9-G3, C9-E12, C9-A11, C11-E3, C11-A9 and C11-H4. PCR
analysis of genomic DNA isolated from the D11-C4 subclone gave rise to
a 2.3 Kb band, indicative of a targeted integration into an rDNA locus.
Further analysis of the subclone by FISH with a DsRed gene
probe demonstrated integration of the LoxDsRedLox targeting cassette on
the ACes co-localizing with one of the regions of rDNA staining seen on
the 5B11-12 ACes, consistent with a targeted integration into an rDNA
locus of the ACes, while integrations on other chromosomes were not
observed. Since transfected cells were maintained as heterogeneous
populations through several cycles of sorting and replating it was not
possible to estimate the frequency of targeted events. In most
mammalian cell lines the frequency of gene targeting via homologous
recombination is roughly 10^-5 to 10^-7 per treated cell. Despite the low
frequency of these events in mammalian cells, it is clear that an RFP
expression based screening paradigm, coupled with PCR analysis, can
effectively detect and enrich for such infrequent events in a large
population. In instances where drug selection is not possible or not
desirable, such a system may provide a useful alternative. It was also
verified that the modified ACes in subclone D11-C4 could be purified by
flow cytometry. The results indicate that the flow karyogram of the
D11-C4 subclone was unaltered from that of the 5B11-12 cell line. Thus, the
D11-C4 ACes can be purified in high yield from native chromosomes of
the host cell line.
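
As a rough illustration of why enrichment is needed, the expected number
of targeted cells can be estimated from figures quoted in this example
(1.5 x 10^6 transfected cells; 10^-5 to 10^-7 targeting events per
treated cell). This sketch assumes that the general mammalian frequency
range applies to this transfection, which was not measured here.

```python
# Sketch: expected number of targeted cells in one transfection.
# ASSUMPTION: the general 1e-5 to 1e-7 range applies; it was not measured
# for this experiment.

def expected_targeted_cells(cells_treated: float, frequency: float) -> float:
    """Expected count of targeted cells = cells treated x event frequency."""
    return cells_treated * frequency


CELLS_TRANSFECTED = 1.5e6  # from the transfection described in this example

for freq in (1e-5, 1e-6, 1e-7):
    n = expected_targeted_cells(CELLS_TRANSFECTED, freq)
    print(f"frequency {freq:.0e}: about {n:.2f} targeted cells expected")
```

At the lower end of the range fewer than one targeted cell is expected
per transfection, so repeated sorting and expansion, rather than direct
screening, is what makes such events recoverable.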

D. Reduction of LoxP on ACes to a single site.

The strong hybridization signal detected by FISH on the ACes using
the DsRed gene probe suggests that several copies of the targeting
cassette may be present on the ACes in the D11-C4 line. This also
suggests that multiple rDNA genes have been correctly targeted.

Accordingly, in certain embodiments where necessary, the number
of loxP sites on the ACes can be reduced to a single site by in situ
treatment with Cre recombinase, provided that the sites are co-linear.
Such a process is described for multiple loxP-flanked integrations on a
native mouse chromosome (Garrick et al., Nature Genet., 1998,
Jan;18(1):56-59). Reduction to a single loxP site on the D11-C4 ACes
would result in the loss of the DsRed gene, forming the basis of a useful
screen for this event.

For this purpose, a Cre expression plasmid pCX-Cre/GFP III has
been generated by first deleting the EcoRI fragment of pCX-eGFP (SEQ ID
NO:71) containing the eGFP coding sequence and replacing it with
a PCR amplified Cre recombinase coding sequence (SEQ ID NO:58),
generating pCX-Cre. Next, the AseI/SspI fragment of pD2eGFP-N1
(containing the CMV promoter driving the D2EGFP gene with SV40 polyA
signal; Clontech; SEQ ID NO:87) was inserted into the filled HindIII site of
pCX-Cre, generating pCX-Cre/GFP III. Control plasmid pCX-CreRev/GFP
III was generated in similar fashion except that the Cre recombinase
coding sequence was inserted in the antisense orientation. LMTK- cell
line D11-C4 (containing first generation platform ACes with multiple
loxP-DsRed sites) and 5B11-12 cell line (containing ACes with no
loxP-DsRed sites) are maintained in culture as described above. D11-C4
cells are transfected with 2 µg of plasmid pCX-Cre/GFP III or 2 µg
pCX-CreRev/GFP III using Lipofectamine (Invitrogen) as previously
described.

Forty-eight to seventy-two hours after transfection, transfected
D11-C4 cells are harvested and GFP positive cells are sorted by cell
cytometry using a FACSta Vantage cell sorter (Becton Dickinson) as
follows: All D11-C4 cells transfected with pCX-Cre/GFP III or control
plasmid pCX-CreRev/GFP III that exhibit GFP fluorescence higher than the
gate level established by untransfected cells are collected and placed in
culture a further 7-14 days. After 7-14 days the initial D11-C4 cells are
harvested and analyzed by cell cytometry as follows: Untransfected
D11-C4 cells are used to establish the gate that defines the RFP positive
population, while 5B11-12 cells are used to set the RFP negative gate.
The GFP positive population of D11-C4 transfected with pCX-Cre/GFP III
should show decreased red fluorescence compared to pCX-CreRev/GFP III
transfected or untransfected control D11-C4 cells. The cells exhibiting
greatly decreased or no RFP expression are collected and single cell
clones subsequently established. These clones will be expanded and
analyzed by fluorescence in situ hybridization and Southern blotting to
confirm the removal of loxP-DsRed gene copies.



EXAMPLE 3

Construction of targeting vector and transfection into LMtk- cells for the
generation of platform chromosomes containing multiple site-specific
recombination sites

An example of a selectable marker system for the creation of a
chromosome-based platform is shown in Figure 4. This system includes a
vector containing the SV40 early promoter immediately followed by (1) a
282 base pair (bp) sequence containing the bacteriophage lambda attP
site and (2) the puromycin resistance marker. Initially a PvuII/StuI
fragment containing the SV40 early promoter from plasmid pPUR
(Clontech Laboratories, Inc., Palo Alto, CA; SEQ ID No. 30) was
subcloned into the EcoICRI site of pNEB193 (a pUC19 derivative
obtained from New England Biolabs, Beverly, MA; SEQ ID No. 32)
generating the plasmid pSV40193. The only differences between pUC19
and pNEB193 are in the polylinker region. A unique AscI site
(GGCGCGCC) is located between the BamHI site and the SmaI site, a
unique PacI site (TTAATTAA) is located between the BamHI site and the
XbaI site and a unique PmeI site (GTTTAAAC) is located between the PstI
site and the SalI site.

The attP site was PCR amplified from lambda genome (GenBank
Accession # NC_001416) using the following primers:

attPUP: CCTTGCGCTAATGCTCTGTTACAGG SEQ ID No. 1 
attPDWN: CAGAGGCAGGGAGTGGGACAAAATTG SEQ ID No. 2 

After amplification and purification of the resulting fragment, the
attP site was cloned into the SmaI site of pSV40193 and the orientation
of the attP site was determined by DNA sequence analysis (plasmid
pSV40193attP). The gene encoding puromycin resistance (Puro) was
isolated by digesting the plasmid pPUR (Clontech Laboratories, Inc. Palo
Alto, CA) with AgeI/BamHI followed by filling in the overhangs with
Klenow and subsequently cloned into the AscI site downstream of the
attP site of pSV40193attP generating the plasmid
pSV40193attPsensePUR (Figure 4; SEQ ID NO: 113).

The plasmid pSV40193attPsensePUR was digested with ScaI and
co-transfected with the plasmid pFK161 (SEQ ID NO: 118) into mouse
LMtk- cells and platform artificial chromosomes were identified and
isolated as described above. The process for generating this exemplary
platform ACes containing multiple site-specific recombination sites is
summarized in Figure 5. One platform ACes resulting from this
experiment is designated B19-18. This platform ACes chromosome may
subsequently be engineered to contain target gene expression nucleic
acids using the lambda integrase mediated site-specific recombination
system as described herein in Examples 7 and 8.

EXAMPLE 4 

Lambda integrase mediated site-specific recombination of a RFP 
expressing vector onto artificial chromosomes 

In this example, a vector expressing the red fluorescent protein
(RFP) was produced and recombined into the attP site residing on an
artificial chromosome within LMTK- cells. This recombination is depicted
in Figure 7.

A. Construction of expression vectors containing wildtype and 
mutant lambda integrase 

Mutations at the glutamic acid at position 174 in the lambda
integrase protein relax the requirement for the accessory protein IHF
during recombination and DNA supercoiling in vitro (see, Miller et al.
(1980) Cell 20:721-729; Lange-Gustafson et al. (1984) J. Biol. Chem.
259:12724-12732). Mutations at this site promote attP, attB
intramolecular recombination in mammalian cells (Lorbach et al. (2000) J.
Mol. Biol. 296:1175-1181).

To construct nucleic acid encoding the mutant, lambda integrase
was PCR amplified from bacteriophage lambda DNA (cI857 ind1 Sam7;
New England Biolabs) using the following primers:

Lamint1 (SEQ ID No. 3)
(TTCGAATTCATGGGAAGAAGGCGAAGTCATGAGCG)

Lamint2 (SEQ ID No. 4)
(TTCGAATTCTTATTTGATTTCAATTTTGTCCCAC).

The resulting PCR product was digested with EcoR I and cloned into the
EcoR I site of pUC19. Lambda integrase was mutated at amino acid
position 174 using QuikChange Site-Directed Mutagenesis Kit
(Stratagene) and the following oligos (generating a glutamic acid to
arginine change at position 174):

LambdaINTE174R (SEQ ID No. 6)
(CGCGCAGCAAAATCTAGAGTAAGGAGATCAAGACTTACGGCTGACG),

LamintR174rev (SEQ ID No. 7)
(CGTCAGCCGTAAGTCTTGATCTCCTTACTCTAGATTTTGCTGCGCG).

The resulting site directed mutant was confirmed by sequence analysis.
The wildtype and mutant lambda integrase genes were cloned into the
EcoR I site of pCX creating pCX-LamInt (SEQ ID NO: 127) and pCXLamIntR
(Figure 8; SEQ ID NO: 112).

The plasmid pCX (SEQ ID No. 70) was derived from plasmid
pCXeGFP (SEQ ID No. 71). Excision of the EcoRI fragment containing the
eGFP marker generated pCX. To generate plasmid pCXLamINTR (SEQ ID
NO: 112) an EcoRI fragment containing the lambda integrase E174R (SEQ
ID No. 37) mutation was cloned into the EcoRI site of pCX, and to
generate plasmid pCX-LamINT, an EcoRI fragment containing the
wild-type lambda integrase was cloned into the EcoRI site of pCX.

B. Construction of integration vector containing attB and DsRed

The plasmid pDsRedN1 (Clontech Laboratories, Palo Alto, CA; SEQ
ID No. 29) was digested with Hpa I and ligated to the following annealed
oligos:

attB1 (SEQ ID No. 8)
(TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA)

attB2 (SEQ ID No. 9)
(TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA)

The resulting vector (pDsRedN1-attB) was confirmed by PCR and
sequence analysis.
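
A quick sanity check on the annealed oligos can be sketched as follows.
It verifies that attB1 and attB2 are exact reverse complements of each
other, so that they anneal into a blunt-ended duplex compatible with the
blunt ends left by Hpa I, and that the duplex contains the 15 bp lambda
att core. The core sequence used here is taken from the general
literature and is an assumption, not something stated in this document.

```python
# Sketch: check that the two attB oligos anneal into a blunt duplex.
# ASSUMPTION: ATT_CORE is the 15 bp lambda att common core from the
# general literature; it is not given in the source text.

ATTB1 = "TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA"  # SEQ ID No. 8
ATTB2 = "TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA"  # SEQ ID No. 9
ATT_CORE = "GCTTTTTTATACTAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def anneals_blunt(top: str, bottom: str) -> bool:
    """True if the oligos pair along their full length with no overhangs."""
    return len(top) == len(bottom) and reverse_complement(top) == bottom


print("blunt duplex:", anneals_blunt(ATTB1, ATTB2))
print("core present:", ATT_CORE in ATTB1)
```

The same check applies to any complementary oligo pair in this example,
such as the two QuikChange mutagenesis oligos for the E174R change.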

C. Transfection into LMtk- cells

LM(tk-) cells containing the Prototype A ACes (L1-18; Chromos
Molecular Systems Inc., Burnaby, BC Canada) were co-transfected with
pDsRedN1 or pDsRedN1-attB and either pCXLamInt (SEQ ID NO: 127) or
pCXLamIntR (SEQ ID NO: 112) using Lipofectamine Plus Reagent
(LifeTechnologies, Gaithersburg, MD). The transfected cells were grown
in DMEM (LifeTechnologies, Gaithersburg, MD) with 10% FBS (CanSera)
and G418 (CalBiochem) at a concentration of 1 mg/ml.

D. Enrichment by cell sorting

The transfected cells were sorted using a FACS Vantage SE cell
sorter (Becton Dickinson) to enrich for cells expressing DsRed. The cells
were excited with a 488 nm Argon laser at 200 watts and cells
fluorescing in the 585/42 detection channel were collected. The sorted
cells were returned to growth medium for recovery and expansion. After
three successive enrichments for cells expressing DsRed, single cell
sorting into 96 well plates was performed using the same parameters.
Duplicate plates of the single cell clones were made for PCR analysis.

E. PCR analysis of single cell clones

Pools of cells from each row and column of the 96 well plate were
used for DNA isolation. DNA was prepared using a Wizard Genomic DNA
purification kit (Promega Inc., Madison, WI). Nested PCR analysis on the
DNA pools was performed to confirm the site-specific recombination
event using the following primer sets:

attPdwn2 (SEQ ID No. 10)
(TCTTCTCGGGCATAAGTCGGACACC)

CMVen (SEQ ID No. 11)
(CTCACGGGGATTTCCAAGTCTCCAC)

followed by:

attPdwn (SEQ ID No. 12)
(CAGAGGCAGGGAGTGGGACAAAATTG)

CMVen2 (SEQ ID No. 13)
(CAACTCCGCCCCATTGACGCAAATG).

The resulting PCR reactions were analyzed by gel electrophoresis and the
potential individual clones containing the site-specific recombination event
were identified by combining the PCR results of all of the pooled rows
and columns for each 96 well plate. The individual clones were then
further analyzed by PCR using the following primers that flank the
recombination junction. L1for and F1rev flank the attR junction whereas
REDfor and L2rev flank the attL junction (see Figure 7):

L1for (SEQ ID No. 14)
AGTATCGCCGAACGATTAGCTCTTCA

F1rev (SEQ ID No. 15)
GCCGATTTCGGCCTATTGGTTAAA

REDfor (SEQ ID No. 16)
CCGCCGACATCCCCGACTACAAGAA

L2rev (SEQ ID No. 17)
TTCCTTCGAAGGGGATCCGCCTACC.
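
The row-and-column pooling scheme used to find candidate clones can be
sketched as follows. This is an illustrative sketch, not the authors'
analysis: wells are named by row letter (A-H) and column number (1-12),
and the pool results passed in are hypothetical.

```python
# Sketch: locate candidate wells from PCR results on row and column pools.
# ASSUMPTIONS: the well naming and the example pool results are
# hypothetical and only illustrate the deconvolution logic.
from itertools import product

ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)


def pools_to_screen() -> int:
    """Number of pooled PCR reactions per plate (one per row and column)."""
    return len(ROWS) + len(COLUMNS)


def candidate_wells(positive_rows, positive_columns):
    """Wells lying at the intersection of PCR-positive row and column pools."""
    return [
        f"{row}{col}"
        for row, col in product(sorted(positive_rows), sorted(positive_columns))
    ]


# Hypothetical result: two row pools and two column pools were positive.
hits = candidate_wells({"B", "F"}, {3, 10})
print(pools_to_screen(), "pooled reactions instead of", len(ROWS) * len(COLUMNS))
print("candidate wells:", hits)
```

When more than one row and more than one column are positive, the
intersections include wells that may be false candidates, which is why
the individual clones are then re-tested with the junction-flanking
primers.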

F. Sequence analysis of recombination junctions

PCR products spanning the recombination junction were Topo-cloned
into pcDNA3.1 D/V5His (Invitrogen Inc., San Diego, CA) and then
sequenced by cycle sequencing. The sequences confirmed that the clones
have the correct attR and attL junctions.

G. Fluorescent In Situ Hybridization (FISH)

The cell lines containing the correct recombination junction
sequence were further analyzed by fluorescent in situ hybridization (FISH)
by probing with the DsRed coding region labeled with biotin and
visualizing with the Tyramide Signal Amplification system (TSA; NEN Life
Science Products). The results indicate that the RFP sequence is present
on the ACes.

H. Southern analysis

Genomic DNA was harvested from the cell lines containing an
ACes with the correct recombinant event and digested with EcoR I. The
digested DNAs were separated on a 0.7% agarose gel, transferred and
fixed to a nylon membrane and probed with RFP coding sequences. The
result showed that there is an integrated copy of RFP coding sequence in
each clone.

EXAMPLE 5 

Delivery of a second gene encoding GFP onto the RFP platform ACes 

A. Construction of integration vector containing attB and GFP
(pD2eGFPIresPuroattB).

The plasmid pIRESpuro2 (Clontech, Palo Alto, CA; SEQ ID NO: 88)
was digested with EcoRI and NotI then ligated to the D2eGFP EcoRI-NotI
fragment from pD2eGFP-N1 (Clontech, Palo Alto, CA) to create
pD2eGFPIresPuro2. Subsequently, oligos encoding the attB site were
annealed and ligated into the NruI site of pD2eGFPIresPuro2 to create
pD2eGFPIresPuroattB. The orientation of attB in the NruI site was
determined by PCR.

B. Transfection of LMtk- cells

The LMtk- cells containing the RFP platform ACes produced in
Example 4, which has multiple attP sites, were co-transfected with
pCXLamIntR and pD2eGFPIresPuroattB using LipofectAMINE PLUS
reagent. Five µg of each vector was placed into a tube containing 750 µl
of DMEM (Dulbecco's modified Eagle's Medium). Twenty µl of the Plus
reagent was added to the DNA and incubated at room temperature for 15
minutes. A mixture of 30 µl of lipofectamine and 750 µl DMEM was
added to the DNA mixture and incubated an additional 15 minutes at
room temperature. The DNA mixture was then added dropwise to
approximately 3 million cells attached to a 10 cm dish in 5 ml of DMEM.
The cells were incubated 4 hours (37°C, 5% CO2) with the DNA-lipid
mixture, after which DMEM with 20% fetal bovine serum was added to
the dishes to bring the culture medium to 10% fetal bovine serum. The
dishes were incubated at 37°C with 5% CO2.
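
The serum adjustment in this protocol follows from a simple dilution
calculation. As a sketch (assuming that volumes simply add, that the
medium on the cells is serum-free, and neglecting the small volumes of
DNA and transfection reagents), the volume of 20% FBS medium needed to
bring the dish to 10% FBS is:

```python
# Sketch: volume of serum-containing medium needed to reach a target
# serum concentration. ASSUMPTIONS: additive volumes, serum-free starting
# medium, DNA and reagent volumes neglected.

def volume_to_add(current_volume_ml: float, current_fraction: float,
                  stock_fraction: float, target_fraction: float) -> float:
    """Solve (c0*V0 + cs*V) / (V0 + V) = ct for the added volume V."""
    if not current_fraction <= target_fraction < stock_fraction:
        raise ValueError("target must lie between current and stock values")
    return (current_volume_ml * (target_fraction - current_fraction)
            / (stock_fraction - target_fraction))


# 5 ml on the dish plus the two 750 µl DMEM portions of the DNA-lipid mix.
dish_volume_ml = 5.0 + 0.75 + 0.75
added_ml = volume_to_add(dish_volume_ml, 0.0, 0.20, 0.10)
print(f"add about {added_ml:.1f} ml of DMEM with 20% FBS")
```

Because the target concentration is exactly half that of the stock, the
volume added equals the volume already on the dish.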

Plasmid pD2eGFPIresPuroattB has a puromycin gene
transcriptionally linked to the GFP gene via an IRES element. Two days
after the transfection the cells were placed in medium containing
puromycin at 4 µg/ml to select for cells containing the
pD2eGFPIresPuroattB plasmid integrated into the genome. Twenty-three
clones were isolated after 17 days of selection with puromycin. These
clones were expanded and then analyzed for the presence of the GFP
gene on the ACes by 2-color (RFP/biotin & GFP/digoxigenin) TSA-FISH
(NEN) according to the manufacturer's protocol. Sixteen of the 23 clones
produced a positive FISH signal on the ACes with a GFP probe.

EXAMPLE 6 

Delivery of ACes into Human Mesenchymal Stem Cells (hMSC)

A. Transfection

Transfection conditions for the most efficient delivery of the ACes
into hMSCs (Cambrex BioWhittaker Product Code PT-2501, lot# F0658,
East Rutherford, New Jersey) were assayed using LipofectAMINE PLUS
and Superfect. One million prototype B ACes, which is a murine derived
60 Mb ACes having primarily murine pericentric heterochromatin, and
carrying a "payload" containing a hygromycin B selectable marker gene
and a lacZ reporter gene (see, Telenius et al., 1999, Chrom. Res., 7:3-7
and Kereso et al., 1996, Chrom. Res., 4:226-239; each of which is
incorporated herein by reference in its entirety), were combined with 1-12
µl of the transfection agent. In the case of LipofectAMINE PLUS, the
PLUS reagent was combined with the ACes for 15 minutes followed by
LipofectAMINE for a further 15 minutes. Superfect was complexed for
10 minutes at a ratio of 2 µl Superfect per 1 million ACes. The
ACes/transfection agent complex was then applied to 0.5 million recipient
cells and the transfection was allowed to proceed according to the
manufacturer's protocol. Percent transfected cells was determined on a
FACS Vantage flow cytometer with argon laser tuned to 488 nm at
200 mW and FITC fluorescence collected through a standard FITC 530/30
nm band pass filter. After 24 hours, IdUrd labeled ACes were delivered
to human MSCs in the range of 30-50%, varying with transfection agent
and dose. ACes delivery curves were generated from data collected in
experiments that varied the dose of the transfection reagents. Dose
response curves of Superfect and LipofectAMINE PLUS, showing delivery
of ACes into recipient hMSCs cells, were prepared, measured by transfer
of IdUrd labeled ACes and detected by flow cytometry. Superfect shows
maximum delivery in the range of 30-50% at doses greater than 2 µl per
million ACes. LipofectAMINE PLUS has a 42-48% delivery peak around
5-8 µl per million ACes. These dose curves were then correlated with
toxicity data to determine the transfection conditions that will allow for
highest potential transfection efficiency. Toxicity was determined by a
modified plating efficiency assay (de Jong et al., 2001, Chrom. Research,
9:475-485). The population's normalized plating efficiency (at maximum
% delivery doses) was in the range of 0.2 - 0.4 for Superfect and 0.5 -
0.6 with LipofectAMINE PLUS.
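
One way to combine the delivery and toxicity data is to multiply the
fraction of cells receiving ACes by the normalized plating efficiency.
The sketch below does this for the ranges reported above. Treating this
product as the "potential transfection efficiency" is an assumption made
for illustration, and pairing the low ends and the high ends of the two
ranges is likewise hypothetical, since the text does not state how
delivery and survival co-vary.

```python
# Sketch: combine delivery and survival into a single figure of merit.
# ASSUMPTIONS: the product formula and the low/low, high/high pairing of
# range endpoints are illustrative, not taken from the source text.

def potential_efficiency(delivery_fraction: float,
                         normalized_plating_efficiency: float) -> float:
    """Fraction of cells that both receive ACes and remain able to grow."""
    return delivery_fraction * normalized_plating_efficiency


def efficiency_range(delivery_range, plating_range):
    """Low and high estimates from paired range endpoints."""
    low = potential_efficiency(delivery_range[0], plating_range[0])
    high = potential_efficiency(delivery_range[1], plating_range[1])
    return low, high


# (delivery range, normalized plating efficiency range) as reported above.
REPORTED = {
    "Superfect": ((0.30, 0.50), (0.2, 0.4)),
    "LipofectAMINE PLUS": ((0.42, 0.48), (0.5, 0.6)),
}

for agent, (delivery, plating) in REPORTED.items():
    low, high = efficiency_range(delivery, plating)
    print(f"{agent}: {low:.2f} to {high:.2f}")
```

Under these assumptions a reagent with similar delivery but lower
toxicity scores higher, which matches the reasoning for weighing the
dose curves against the plating efficiency data.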

Due to the transfected population consisting of mixed cell types,
flow cytometry allowed for the assessment of ACes delivery into each
sub-population and the purification of the target population. Flow profiles
showing forward scatter (cell size) and side scatter (internal cell
granularity) revealed three distinct hMSC populations that were gated into
three regions: R3 (small cell region), R4 (medium cell region), R5 (large
cell region). Transfection conditions were further optimized by
re-analyzing delivery curves and assessing the differences in delivery to
each sub-population. Dose response curves of Superfect and LipofectAMINE
were prepared showing % delivery to each sub-population represented by
the gating on basis of cell size and granularity properties of the mixed
population. Three distinct hMSC populations were gated and % delivery
dose curves generated. Using Superfect and LipofectAMINE PLUS the
overall % delivery increased with cell size (80-90% delivery in large cells).
LipofectAMINE PLUS at high doses (8-12 µl per 1 million ACes) shows an
increase in the overall proportion of chromosome transfer to the small
population (10-20%). This suggests an advantage to using this
transfection agent if the small-undifferentiated cell population is the
desired target host cell.

B. Expression from Genes on ACes in hMSCs

Following the delivery screening process conducted in section (A)
above, the most promising results were subjected to further analyses to
monitor expression and verify the presence of structurally intact ACes.
The transfection conditions employed for these experiments were exactly
the same as those that had been used during the screening process.
Short-term expression was monitored by transfecting hMSCs with ACes
containing an RFP (red fluorescent protein) gene, set forth in Example 2C
as "D11C4". The unselected population was harvested at 72-96 hours
post-transfection and the % positive fluorescent cells measured by flow
cytometry. RFP expression was in the range of 1-20%.

Long-term gene expression was assayed by selecting for
hygromycin B resistant cells over a period of 7-10 days. Cytogenetic
analysis was done to detect the presence of intact ACes by fluorescent in
situ hybridization (FISH), where metaphase chromosomes were hybridized
to a mouse major satellite-DNA probe (targeting murine pericentric
heterochromatin) and a lambda probe (hybridizing to the lacZ gene). The
human mesenchymal transfected culture could not undergo standard
sub-cloning, as diffuse colonies form with limited doublings available for
expansion. Cytogenetic analysis was performed on the entire population,
sampling over a period of 3-10 days post-transfection. The hygromycin
resistant population was then blocked in mitosis with colchicine and
analyzed for presence of intact ACes by FISH. Preliminary FISH results
show approximately 2-8% of the hMSC-transfected population had an
intact ACes. This compares to rat skeletal muscle myoblast clones,
which were in the range of 60-95%. To increase the % of intact ACes in
the hMSC-transfected population, an enrichment step can be utilized as
described in Example 2C.

C. Differentiation of the hMSCs

In initial experiments where transfected hMSCs have been
induced to differentiate into adipocytes or osteocytes, the results indicate
that the transfected cells appear to be differentiating at a rate comparable
to the untransfected controls and the cultures are lineage specific as
tested by microscopic examination, FISH, Oil Red O staining (adipocyte
assay), and calcium secretion (osteocyte assay).

Accordingly, these results indicate that the artificial chromosomes
(ACes) provided herein can be successfully transferred into hMSC target
cells. Targeting MSCs (such as hMSCs) permits gene transfer into cells in
an undifferentiated state where the cells are easier to expand and purify.
The genetically modified cells can then be differentiated in vitro or
injected into a site in vivo where the microenvironment will induce
transformation into specific cell lineages.



WO 02/097059 



PCT/US02/17452 



-117- 
EXAMPLE 7 

Delivery of a Promoterless Marker Gene to a Platform ACes 

Platform ACes containing pSV40attPsensePURO (Figure 4) were
constructed as set forth in Examples 3 and 4.

A. Construction of Targeting Vectors.

The base vector p18attBZeo (3166bp; SEQ ID NO: 114) was
constructed by ligating the 1067bp HindIII-SspI fragment containing
attBZeo, obtained from pLITattBZeo (SEQ ID NO:91), into pUC18 (SEQ ID
NO: 122) digested with HindIII and SspI.

1. p18attBZEO-eGFP (6119bp; SEQ ID NO: 126) was constructed
by inserting the 2977bp SpeI-HindIII fragment from pCXeGFP (SEQ ID
NO:71; Okabe, et al. (1997) FEBS Lett 407:313-319) containing the eGFP
gene into p18attBZeo (SEQ ID NO: 114) digested with HindIII and XbaI.

2. p18attBZEO-5'6XHS4eGFP (Figure 10; 7631bp; SEQ ID NO:
116) was constructed by ligating the 4465bp HindIII fragment from
pCXeGFPattB(6XHS4)2 (SEQ ID NO: 123), which contains the eGFP gene
under the regulation of the chicken beta actin promoter, 6 copies of the
HS4 core element located 5' of the chicken beta actin promoter and the
polyadenylation signal, into the HindIII site of p18attBZeo (SEQ ID NO:
114).

3. p18attBZEO-3'6XHS4eGFP (Figure 11; 7600bp; SEQ ID NO:
115) was created by removing the 5'6XHS4 element from
p18attBZeo-(6XHS4)2eGFP (SEQ ID NO: 110). p18attBZeo-(6XHS4)2eGFP was
digested with EcoRV and SpeI, treated with Klenow and religated to form
p18attBZeo3'6XHS4eGFP (SEQ ID NO: 115).

4. p18attBZEO-(6XHS4)2eGFP (Figure 12; 9080bp; SEQ ID NO:
110) was created in two steps. First, the EcoRI-SpeI fragment from
pCXeGFPattB(6XHS4)2 (SEQ ID NO: 123), which contains 6 copies of the
HS4 core element, was ligated into p18attBZeo (SEQ ID NO: 114)
digested with EcoRI and XbaI to create p18attBZeo6XHS4 (4615bp; SEQ
ID NO: 117). Next, p18attBZeo6XHS4 was digested with HindIII and
ligated to the 4465bp HindIII fragment from pCXeGFPattB(6XHS4)2,
which contains the eGFP gene under the regulation of the chicken beta
actin promoter, 6 copies of the HS4 core element located 5' of the
chicken beta actin promoter and the polyadenylation signal.

Table 2

Targeting plasmid         No. zeocin   No. clones with   No. clones with correct
                          resistant    expected PCR      sequence at
                          clones       product size      recombination junction

p18attBZEOeGFP            12           12                NT*
p18attBZEO-5'6XHS4eGFP    11           11                NT
p18attBZEO-3'6XHS4eGFP    11           11                NT
p18attBZEO-(6XHS4)2eGFP   9            9                 4/4


B. Transfection and Selection with Drug. 

The mouse cell line containing the 2nd generation platform ACE,
B19-38 (constructed as set forth in Example 3), was plated onto four
10cm dishes at approximately 5 million cells per dish. The cells were
incubated overnight in DMEM with 10% fetal calf serum at 37°C and 5%
CO2. The following day the cells were transfected with 5µg of each of
the 4 vectors listed in Example 7.A. above and 5µg of pCXLamIntR (SEQ
ID NO: 112), for a total of 10µg per 10cm dish. Lipofectamine Plus
reagent was used to transfect the cells according to the manufacturer's
protocol. Two days post-transfection zeocin was added to the medium at
500µg/ml. The cells were maintained in selective medium until colonies
formed. The colonies were then ring-cloned (see, e.g., McFarland, 2000,
Methods Cell Sci, Mar;22(1):63-66).

C. Analysis of Clones (PCR, Sequencing).

Genomic DNA was isolated from each of the candidate clones with
the Wizard kit (Promega), following the manufacturer's protocol. The
following primer set was used to analyze the genomic DNA isolated from
the zeocin resistant clones: 5PacSV40 -
CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO: 76);
Antisense Zeo - TGAACAGGGTCACGTCGTCC (SEQ ID NO:77). PCR
amplification with the above primers and genomic DNA from the
site-specific integration of any of the 4 zeocin vectors would result in a
673bp PCR product.

As set forth in Table 2, of the 4 zeocin resistant candidate clones
thus far analyzed by sequencing, all 4 exhibit the correct sequence for a
site-specific integration event.

EXAMPLE 8 

Integration of a PCR product by site-specific recombination. 
In this example a gene is integrated onto the platform ACes by
site-specific recombination without cloning said gene into a vector.

A. PCR PRIMER DESIGN.

PCR primers are designed to contain an attB site at the 5' end of
one of the primers in the primer set. The remaining primers, which could
be one or more than one primer, do not contain an attB site, but are
complementary to sequences flanking the gene or genes of interest and
any associated regulatory sequences. In a first example, 2 primers (one
containing an attB site) are used to amplify a selective gene such as
puromycin.

In a second example, as shown in Figure 13, the primer set includes
primers 1 & 2 that amplify the GFP gene without amplification of an
upstream promoter. Primer 1 contains the attB site at the 5' end of the
oligo. Primers 3 & 4 are designed to amplify the IRES-blasticidin DNA
sequences from the vector pIRESblasticidin. The 5' end of primer 3
contains sequences complementary to the 5' end of primer 2 such that
annealing can occur between the 5' ends of the two primers.

B. PCR REACTION AND SUBSEQUENT LIGATION TO CREATE 
CIRCULAR MOLECULES FROM THE PCR PRODUCT 

In the first example set forth above in Section A, the two PCR
primers are combined with a puromycin DNA template such as pPUR
(Clontech), a heat stable DNA polymerase and appropriate conditions for
DNA amplification. The resulting PCR product (attB-Puromycin) is
then purified and self-ligated to form a circular molecule.

In the second example set forth above in Section A, amplification
of the GFP gene and IRES-blasticidin sequences is accomplished by
combining primers 1 & 2 with DNA template pD2eGFP and primers 3 & 4
with template pIRESblasticidin under appropriate conditions to amplify the
desired template. After initial amplification of the two products (attB-GFP
& IRES-blasticidin) in separate reactions, a second round of amplification
using both of the PCR products from the first round of amplification
together with primers 1 and 4 amplifies the fusion product
attB-GFP-IRES-blasticidin (Figure 13). This technique of using
complementary sequences in primer design to create a fusion product is
employed in Saccharomyces cerevisiae for allele replacement (Erdeniz et al.
(1997) Gen Res 7:1174-1183). The amplified product is then purified from
the PCR reaction mixture by standard methods and ligated to form a
circular molecule.

C. INTRODUCTION OF PCR PRODUCT ONTO THE ACes USING A 
RECOMBINASE 

The circular PCR product is then introduced to the platform
ACes using the bacteriophage lambda integrase E174R. The introduction
can be performed in vivo by transfecting the pCXLamIntR (SEQ ID NO:
112) vector encoding the lambda integrase mutant E174R together with
the circularized PCR product into a cell line containing the platform ACE.

D. SELECTION FOR MARKER GENE

The marker gene (in this case either puromycin, blasticidin or GFP) 
is used to enrich the population for cells containing the proper integration 
event. A proper integration event in the second example (Figure 14) 
juxtaposes a promoter residing on the platform ACes 5' to the attB-GFP- 
IRES-Blasticidin PCR product, allowing for transcription of both GFP and 
blasticidin. If enrichment is done by drug selection, blasticidin is added to 
the medium on the transfected cells 24-48 hours post-transfection. 
Selection is maintained until colonies are formed on the plates. If 
enrichment is done by cell sorting, cells are sorted 2-4 days
post-transfection to enrich for cells expressing the fluorescent marker
(GFP in this case).

E. ANALYSIS OF CLONES 

Clonal isolates are analyzed by PCR, FISH and sequence analysis to 
confirm proper integration events. 

EXAMPLE 9 

Construction of a human platform ACes "ACE 0.1" 

A. CONSTRUCTION OF THE TARGETING VECTOR pPACrDNA

Genome Systems (IncyteGenomics) was supplied with the primers
5'HETS (GGGCCGAAACGATCTCAACCTATT; SEQ ID NO:78), and
3'HETS (CGCAGCGGCCCTCCTACTC; SEQ ID NO:79), which were used
to amplify a 538bp PCR product homologous to nt 9680-10218 of the
human rDNA sequences (GenBank Accession No. U13369) and used as a
probe to screen a human genomic PAC (P1 Artificial Chromosome)
library constructed in the vector pCYPAC2 (Ioannou et al. (1994) Nat.
Genet. 6(1): 84-89). Genome Systems clone #18720 was isolated in this
screen and contains three repeats of human rDNA as assessed by
restriction analysis. GS clone #18720 was digested with PmeI, a
restriction enzyme unique to a single repeat of the human rDNA (45Kbp),
and then religated to form pPACrDNA (Figure 15). The insert in
pPACrDNA was analyzed by restriction digests and sequence analysis of
the 5' and 3' termini. The pPACrDNA rDNA sequences are homologous
to GenBank Accession No. U13369, containing an insert of about 45 kb
comprising a single repeat beginning from the end of one repeat at
~33980 (relative to the GenBank sequence) through the beginning of the
next repeat up to approximately 35120 (the repeat is offset from that
listed in the GenBank file). Thus, the rDNA sequence is just over 1 copy
of the repeat, extending from 33980 (+/-10bp) to the end of the first
repeat (43Kbp) and continuing into the second repeat to bp 35120 (+/-10bp).
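
The extent of the insert described above can be checked with simple arithmetic. The following is only a sketch: it assumes a repeat length of 43,000 bp (the text gives "43Kbp") and uses the approximate coordinates quoted above; the function name is illustrative and not part of the described method.

```python
# Approximate length of the pPACrDNA insert from the coordinates in the text.
# Assumption: one rDNA repeat is taken as 43,000 bp ("43Kbp" in the text).

def insert_length(start: int, end_in_next_repeat: int, repeat_len: int) -> int:
    """Length of a segment that starts at `start` in one repeat, runs to the
    end of that repeat, and continues to `end_in_next_repeat` in the next."""
    return (repeat_len - start) + end_in_next_repeat

REPEAT_LEN = 43000  # assumed, see above
length_bp = insert_length(start=33980, end_in_next_repeat=35120,
                          repeat_len=REPEAT_LEN)
copies = length_bp / REPEAT_LEN
print(f"insert ~{length_bp} bp, ~{copies:.2f} copies of the repeat")
```

Under these assumptions the insert comes to roughly 44 kb, which is in line with "about 45 kb" and "just over 1 copy of the repeat".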

B. TRANSFECTION AND ACes FORMATION. 

Five hundred thousand MSU1.1 cells (Morgan et al., 1991, Exp.
Cell Res., Nov;197(1):125-136; provided by Dr. Justin McCormick at
Michigan State University) were plated per 6cm plate (3 plates total) and
allowed to grow overnight. The cells were 70-80% confluent the
following day. One plate was transfected with 15µg pPACrDNA
(linearized with PmeI) and 2µg pSV40attPsensePuro (linearized with
ScaI; see Example 3). The remaining plates were controls and were
transfected with either 20µg pBS (Stratagene) or 20µg
pSV40attBsensePuro (linearized with ScaI). All three plates were
transfected using a CaPO4 protocol.

C. SELECTION OF PUROMYCIN RESISTANT COLONIES 

One day post-transfection the cells were "glycerol shocked" by the
addition of PBS medium containing 10% glycerol for 30 seconds.
Subsequently, the glycerol was removed and replaced with fresh DMEM.
Four days post-transfection selective medium was added. Selective
medium contains 1µg/ml puromycin. The transfection plates were
maintained at 37°C with 5% CO2 in selective medium for 2 weeks, at
which point colonies could be seen on the plate transfected with
pPACrDNA and pSV40attPsensePuro. The colonies were ring-cloned
from the plate on day 17 post-selection and expanded in selective
medium for analysis. Only two colonies (M2-2d & M2-2b) were able to
proliferate in the selective medium after cloning. No colonies were seen
on the control plates after 37 days in selective medium.

D. ANALYSIS OF CLONES

FISH analysis was performed on the candidate clones to detect
ACes formation. Metaphase spreads from the candidate clones were
probed in multiple probe combinations. In one experiment, the probes
used were biotin-labeled human alphoid DNA (pPACrDNA) and
digoxigenin-labeled mouse major DNA (pFK161) as a negative control.
Candidate M2-2d was single cell subcloned by flow sorting and the
candidate subclones were reanalyzed by FISH. Subclone 1B1 of M2-2d
was determined to be a platform ACes and is also designated human
Platform ACE 0.1.

EXAMPLE 10 

Site-specific integration of a marker gene onto a human platform ACE 0.1 

The promoterless delivery method was used to deliver a
promoterless blasticidin marker gene onto the human platform ACes with
excellent results. The human ACes platform with a promoterless
blasticidin marker gene resulted in 21 of 38 blasticidin resistant clones
displaying a PCR product of the expected size from the population
co-transfected with pLIT38attBBSRpolyA10 and pCXLamIntR (Figure 8; SEQ
ID NOs. 111 and 112). In contrast, the population transfected with
pBlueScript resulted in 0 blasticidin resistant colonies.

A. CONSTRUCTION OF pLIT38attB-BSRpolyA10 & pLIT38attB-BSRpolyA2.

The vector pLITMUS 38 (New England Biolabs; U.S. Patent No.
5,691,140; SEQ ID NO: 119) was digested with EcoRV and ligated to
two annealed oligomers, which form an attB site (attB1
5'-TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA-3' (SEQ ID NO:8); attB2
5'-TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA-3' (SEQ ID NO:9)).
This ligation reaction resulted in the vector pLIT38attB (SEQ ID NO: 120).
The blasticidin resistance gene and SV40 polyA site were PCR amplified
with primers 5BSD (ACCATGAAAACATTTAACATTTCTCAACA; SEQ ID
NO:80) and SV40polyA (TTTATTTGTGAAATTTGTGATGCTATTGC; SEQ
ID NO:81) using pPAC4 (Frengen, E., et al. (2000) Genomics 68 (2),
118-126; GenBank Accession No. U75992) as template. The
blasticidin-SV40polyA PCR product was then ligated into pLIT38attB at the
BamHI site, which was Klenow treated following digestion with BamHI.
pLIT38attB-BSDpolyA10 (SEQ ID NO: 111) and pLIT38attB-BSDpolyA2
(SEQ ID NO: 121) are the two resulting orientations of the PCR product
ligated into the vector.
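
For the two oligomers to anneal into a blunt-ended attB duplex suitable for ligation into the EcoRV-cut vector, each should be the reverse complement of the other. The short check below is offered only as an illustrative sketch; the helper name `reverse_complement` is not from the source, and the sequences are copied from SEQ ID NOs: 8 and 9 as given above.

```python
# Check that the two annealed oligomers are exact reverse complements,
# i.e. that they can pair along their full length with no overhang.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

attB1 = "TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA"  # SEQ ID NO:8
attB2 = "TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA"  # SEQ ID NO:9

is_blunt_duplex = reverse_complement(attB1) == attB2
print(len(attB1), len(attB2), is_blunt_duplex)
```

The same helper can be reused to sanity-check other annealed oligomer pairs before ordering them.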

15 B. TRANSFECTION OF MSU1.1 CELLS CONTAINING HUMAN 
PLATFORM ACE 0.1. 

MSU1.1 cells containing human platform ACE 0.1 (see Example 9)
were expanded and plated to five 10cm dishes with 1.3x10^6 cells per dish.
The cells were incubated overnight in DMEM with 10% fetal bovine
serum, at 37°C and 5% CO2. The following day the cells were
transfected with 5µg of each plasmid as set forth in Table 3, for a total of
10µg of DNA per plate of cells transfected (see Table 3), using ExGen 500
in vitro transfection reagent (MBI Fermentas, cat. no. R0511). The
transfection was performed according to the manufacturer's protocol.

Cells were incubated at 37°C with 5% CO2 in DMEM with 10% fetal
bovine serum following the transfection.

Table 3

Plate #   Plasmid 1     Plasmid 2                No. BsdR Colonies

1         pBS           None                     0
2         pCXLamInt     pLIT38attB-BSRpolyA10    16
3         pCXLamIntR    pLIT38attB-BSRpolyA10    40
4         pCXLamInt     pLIT38attB-BSRpolyA2     28
5         pCXLamIntR    pLIT38attB-BSRpolyA2     36

C. SELECTION OF BLASTICIDIN RESISTANT CLONES.

Three days following the transfection the cells were split from a 10
cm dish to two 15cm dishes. The cells were maintained in DMEM with
10% fetal bovine serum for 4 days in the 15 cm dishes. Seven days
post-transfection blasticidin was introduced into the medium. Stably
transfected cells were selected with 1µg/ml blasticidin. The number of
colonies formed on each plate is listed in Table 3. These colonies were
ring-cloned and expanded for PCR analysis. Upon expansion in blasticidin
containing medium some clones failed to live and therefore do not have
corresponding PCR data.

D. PCR ANALYSIS

Thirty-eight of the 40 clones from plate 3 grew after ring-cloning.
Genomic DNA was isolated from these clones with the Promega Wizard
Genomic DNA purification kit, digested with EcoRI and used as template
in a PCR reaction with the following primers: 3BSP -
TTAATTTCGGGTATATTTGAGTGGA (SEQ ID NO:82); 5PacSV40 -
CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO:76).
The PCR conditions were as follows: 100ng of genomic DNA was
amplified with 0.5µl Herculase polymerase (Stratagene) in a 50µl reaction
that contained 12.5pmole of each primer, 2.5mM of each dNTP, and 1X
Herculase buffer (Stratagene). The reactions were placed in a PerkinElmer
thermocycler programmed as follows: initial denaturation at 95°C for 10
minutes; 35 cycles of 94°C for 1 minute, 53°C for 1 minute, 72°C for 1
minute, and 72°C for 1 minute; final extension for 10 minutes at 72°C;
and 4°C hold. If pLIT38attB-BSRpolyA10 integrates onto the human
platform ACE 0.1 correctly, PCR amplification with the above primers
should yield an 804bp product. Twenty-one of the 38 clones from plate
3 produced a PCR product of the expected 804bp size.
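
The figures reported for plate 3 can be summarized with a few lines of arithmetic. This is merely a bookkeeping sketch using the numbers stated in this Example; the variable names are illustrative and nothing here is part of the described protocol.

```python
# Summary of the plate 3 screen, using the counts given in the text.
colonies_picked = 40   # blasticidin resistant colonies on plate 3 (Table 3)
clones_grown = 38      # clones that grew after ring-cloning
pcr_positive = 21      # clones giving the expected 804bp product

survival = clones_grown / colonies_picked
positive_rate = pcr_positive / clones_grown

# Primer concentration in the reaction: pmol per µl is numerically µM.
primer_pmol = 12.5
reaction_ul = 50.0
primer_uM = primer_pmol / reaction_ul

print(f"survival after cloning: {survival:.0%}")
print(f"PCR-positive among surviving clones: {positive_rate:.1%}")
print(f"each primer at {primer_uM} µM")
```

On these numbers, a little over half of the surviving clones carry a product of the size expected for site-specific integration.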

EXAMPLE 11

Delivery of a Vector comprising a Promoterless Marker Gene and a gene
encoding a therapeutic product to a Platform ACes

Platform ACes containing pSV40attPsensePURO (Figure 4) were
constructed as set forth in Examples 3 and 4.

A. CONSTRUCTION OF DELIVERY VECTORS 

1. Erythropoietin cDNA vector, p18EPOcDNA.

The erythropoietin cDNA was PCR amplified from a human cDNA
library (E. Perkins et al., 1999, Proc. Natl. Acad. Sci. USA 96(5):
2204-2209) using the following primers: EPO5XBA -
TATCTAGAATGGGGGTGCACGAATGTCCTGCC (SEQ ID NO: 83);
EPO3BSI - TACGTACGTCATCTGTCCCCTGTCCTGCAGGC (SEQ ID NO:
84). The cDNA was amplified through two successive rounds of PCR
using the following conditions: heat denaturation at 95°C for 3 minutes;
35 cycles of a 30 second denaturation (95°C), 30 seconds of annealing
(60°C), and 1 minute extension (72°C); the last cycle is followed by a 7
minute extension at 72°C. BIO-X-ACT (BIOLINE) was used to amplify the
erythropoietin cDNA from 2.5ng of the human cDNA library in the first
round of amplification. Five µl of the first amplification product was used
as template for the second round of amplification. Two PCR products
were produced from the second amplification with Taq polymerase
(Eppendorf); each product was cloned into pCR2.1-Topo (Invitrogen) and
sequenced. The larger PCR product contained the expected cDNA
sequence for erythropoietin. The erythropoietin cDNA was moved from
pTopoEPO into p18attBZeo(6XHS4)2eGFP (SEQ ID NO: 110). pTopoEPO
was digested with BsiWI and XbaI to release a 588 bp EPO cDNA. BsrGI
and BsiWI create compatible ends. The eGFP gene was removed from
p18attBZeo(6XHS4)2eGFP by digestion with BsiWI and XbaI; the 8.3 Kbp
vector backbone was gel purified and ligated to the 588 bp EPO cDNA to
create p18EPOcDNA (SEQ ID NO: 124).

2. Genomic erythropoietin vector, p18genEPO.

The erythropoietin genomic clone was PCR amplified from a human
genomic library (Clontech) using the following primers: GENEPO3BSI -
CGTACGTCATCTGTCCCCTGTCCTGCA (SEQ ID NO: 85); GENEPO5XBA -
TCTAGAATGGGGGTGCACGGTGAGTACT (SEQ ID NO: 86). The
reaction conditions for the amplification were as follows: heat
denaturation for 3 minutes (95°C); 30 cycles of a 30 second denaturation
(95°C), 30 seconds annealing (from 65°C decreasing 0.5°C per cycle to
50°C), and 3 minutes extension (72°C); 15 cycles of a 30 second
denaturation (95°C), 30 seconds annealing (50°C), and 3 minute
extension (72°C); the last cycle is followed by a 7 minute extension at
72°C. The erythropoietin genomic PCR product (2147 bp) was gel
purified and cloned into pCR2.1Topo to create pTopogenEPO. Sequence
analysis revealed 2bp substitutions and insertions in the intronic
sequences of the genomic clone of erythropoietin. A partial digest with
XbaI and complete digest with BsiWI excised the erythropoietin genomic
insert from pTopogenEPO. The resulting 2158 bp genomic erythropoietin
fragment was ligated into the 8.3 Kbp fragment resulting from the
digestion of p18attBZeo(6XHS4)2eGFP (SEQ ID NO: 110) with XbaI and
BsrGI to create p18genEPO (SEQ ID NO: 125).
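
The touchdown annealing program described above can be written out cycle by cycle. The sketch below rests on an assumption that the text does not settle: the first touchdown cycle anneals at 65°C and each subsequent cycle is 0.5°C cooler. The function name and defaults are illustrative only.

```python
# Annealing temperature per cycle for the touchdown program in the text:
# 30 touchdown cycles starting at 65°C and dropping 0.5°C per cycle,
# followed by 15 cycles at a constant 50°C.
# Assumption: the first cycle anneals at the starting temperature.

def touchdown_schedule(start_c=65.0, step_c=0.5, floor_c=50.0,
                       touchdown_cycles=30, plateau_cycles=15):
    """Return a list with the annealing temperature (°C) of every cycle."""
    temps = [max(floor_c, start_c - step_c * i) for i in range(touchdown_cycles)]
    temps += [floor_c] * plateau_cycles
    return temps

schedule = touchdown_schedule()
print(len(schedule), "cycles in total")
print("first:", schedule[0], "last touchdown:", schedule[29],
      "plateau:", schedule[30])
```

Under this reading the thirtieth touchdown cycle anneals at 50.5°C and the plateau cycles follow at 50°C; if the first cycle were instead taken as 64.5°C, the touchdown phase would end exactly at 50°C.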

B. TRANSFECTION AND SELECTION WITH DRUG 

The erythropoietin genomic and cDNA genes were each moved
onto the platform ACes B19-38 (constructed as set forth in Example 3) by
co-transfecting with pCXLamIntR. Control transfections were also
performed using pCXLamInt (SEQ ID NO: 127) together with either
p18EPOcDNA (SEQ ID NO: 124) or p18genEPO (SEQ ID NO: 125).
Lipofectamine Plus was used to transfect the DNAs into B19-38 cells
according to the manufacturer's protocol. The cells were placed in
selective medium (DMEM with 10% FBS and Zeocin at 500µg/ml) 48
hours post-transfection and maintained in selective medium for 13 days.
Clones were isolated 15 days post-transfection.

C. ANALYSIS OF CLONES (ELISA, PCR)

1. ELISA Assays

Thirty clones were tested for erythropoietin production by an ELISA
assay using a monoclonal anti-human erythropoietin antibody (R&D
Systems, Catalogue # MAB287), a polyclonal anti-human erythropoietin
antibody (R&D Systems, Catalogue # AB-286-NA) and alkaline
phosphatase conjugated goat-anti-rabbit IgG (heavy and light chains)
(Jackson Immuno Research Laboratories, Inc., Catalogue # 111-055-144).
The negative control was a Zeocin resistant clone isolated from B19-38
cells transfected with p18attBZeo(6XHS4) (SEQ ID NO: 117; no insert
control vector) and pCXLamIntR (SEQ ID NO: 112). The preliminary
ELISA assay was executed as follows: 1) Nunc-Immuno Plates (MaxiSorb
96-well, Catalogue # 439454) were coated with 75µl of a 1/200 dilution
(in Phosphate buffered Saline, pH 7.4 (PBS), Sigma Catalogue # P-3813)
of monoclonal anti-human erythropoietin antibody overnight at 4°C. 2)
The following day the plates were washed 3 times with 300µl PBS
containing 0.15% Tween 20 (Sigma, Catalogue # P-9416). 3) The plates
were then blocked with 300µl of 1% Bovine Serum Albumin (BSA; Sigma
Catalogue # A-7030) in PBS for 1 hour at 37°C. 4) Repeat the washes as
in step 2. 5) The clonal supernatants (75µl per clone per well of 96-well
plate) were then added to the plate and incubated for 1 hour at 37°C.
The clonal supernatant analyzed in the ELISA assay had been maintained
on the cells 7 days prior to analysis. 6) Repeat the washes of step 2. 7)
Add 75µl of polyclonal anti-human erythropoietin antibody (1/250 dilution
in dilution buffer (0.5% BSA, 0.01% Tween 20, 1X PBS, pH 7.4)) and
incubate 1 hour at 37°C. 8) Repeat washes of step 2. 9) Add 75µl of
goat anti-rabbit conjugated alkaline phosphatase diluted 1/4000 in dilution
buffer and incubate 1 hour at 37°C. 10) Repeat washes of step 2. 11)
Add 75µl substrate, p-nitrophenyl phosphate (Sigma N2640), diluted to
1mg/ml in substrate buffer (0.1 Ethanolamine-HCl (Sigma, Catalogue #
E-6133), 5mM MgCl2 (Sigma, Catalogue # M-2393), pH 9.8). Incubate the
plates in the dark for 1 hour at room temperature (22°C). 12) Read the
absorption at 405nm (reference wavelength 495nm) on a Universal
Microplate Reader (Bio-Tek Instruments, Inc., model # ELX800 UV). The
erythropoietin standard curve was derived from readings of diluted human
recombinant Erythropoietin (Roche, catalogue # 1-120-166; dilution range
125 - 7.8mUnits/ml). From this preliminary assay the 21 clones
displaying the highest expression of erythropoietin were analyzed a
second time in the same manner using medium supernatants that had
been on the clones for 24 hours and a 1:3 dilution thereof.
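
As an illustration of how a standard curve over the stated range might be used to read sample concentrations, the sketch below makes two assumptions that are not stated in the text: the standards form a two-fold serial dilution (125 down to about 7.8 mUnits/ml is consistent with five two-fold steps), and concentrations are read off by linear interpolation between neighbouring standards. The absorbance values are hypothetical and serve only to exercise the code.

```python
# Sketch: build a two-fold standard series and interpolate a sample reading.
# Assumptions (not from the source): two-fold dilution steps, linear
# interpolation between adjacent standards, and made-up absorbance readings.

def twofold_series(top, n_points):
    """Concentrations of a two-fold serial dilution starting at `top`."""
    return [top / 2 ** i for i in range(n_points)]

def interpolate_conc(absorbance, std_abs, std_conc):
    """Linearly interpolate a concentration from neighbouring standards."""
    points = sorted(zip(std_abs, std_conc))
    for (a0, c0), (a1, c1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
    raise ValueError("absorbance lies outside the standard curve")

standards = twofold_series(125.0, 5)          # mUnits/ml
example_abs = [1.60, 0.80, 0.40, 0.20, 0.10]  # hypothetical A405 readings

sample_conc = interpolate_conc(0.60, example_abs, standards)
print(standards)
print(f"sample: {sample_conc:.3f} mUnits/ml")
```

For a supernatant assayed at the 1:3 dilution mentioned above, the interpolated value would be multiplied by the dilution factor to estimate the concentration in the undiluted medium.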

2. PCR Analysis

Genomic DNA was isolated from the 21 clones with the best
expression (as assessed by the initial ELISA assay above) as well as the
B19-38 cell line and used for PCR analysis. Genomic DNA was isolated
using the Wizard genomic DNA purification kit (Promega) according to the
manufacturer's protocol. Amplification was performed on 100ng of
genomic DNA as template with MasterTaq DNA Polymerase (Eppendorf)
and the primer set 5PacSV40 -
CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO: 76) and
Antisense Zeo - TGAACAGGGTCACGTCGTCC (SEQ ID NO: 77). The
amplification conditions were as follows: heat denaturation for 3 minutes
(95°C); 30 cycles of a 30 second denaturation (95°C), 30 seconds annealing
(from 65°C decreasing 0.5°C per cycle to 50°C), and 1 minute extension
(72°C); 15 cycles of a 30 second denaturation (95°C), 30 seconds
annealing (50°C), and 1 minute extension (72°C); the last cycle is
followed by a 10 minute extension at 72°C. PCR products were size
separated by gel electrophoresis. Of the 21 clones analyzed, 19 produced
a PCR product of 650 bp as expected for a site-specific integration event.
All nineteen clones were the result of transformations with p18EPOcDNA
(5) or p18genEPO (14) and pCXLamIntR (i.e. mutant integrase). The
remaining two clones, both of which were the result of transformation
with p18genEPO (SEQ ID NO: 125) and pCXLamInt (i.e. wildtype
integrase; SEQ ID NO: 127), produced a 400 bp PCR product.

Example 12

Preparation of a Transformation Vector Useful for the Induction of Plant
Artificial Chromosome Formation

Plant artificial chromosomes (PACs) can be generated by
introducing nucleic acid, such as DNA, which can include a targeting
DNA, for example rDNA or lambda DNA, into a plant cell, allowing the cell
to grow, and then identifying from among the resulting cells those that
include a chromosome with a structure that is distinct from that of any
chromosome that existed in the cell prior to introduction of the nucleic
acid. The structure of a PAC reflects amplification of chromosomal DNA,
for example, segmented, repeat region-containing and heterochromatic
structures. It is also possible to select cells that contain structures that
are precursors to PACs, for example, chromosomes containing more than
one centromere and/or fragments thereof, and culture and/or manipulate
them to ultimately generate a PAC within the cell.

In the method of generating PACs, the nucleic acid can be
introduced into a variety of plant cells. The nucleic acid can include
targeting DNA and/or a plant-expressible DNA encoding one or multiple
selectable markers (e.g., DNA encoding bialaphos (bar) resistance) or
scorable markers (e.g., DNA encoding GFP). Examples of targeting DNA
include, but are not limited to, N. tabacum rDNA intergenic spacer
sequence (IGS) and Arabidopsis rDNA such as the 18S, 5.8S, 26S rDNA
and/or the intergenic spacer sequence. The DNA can be introduced using
a variety of methods, including, but not limited to,
Agrobacterium-mediated methods, PEG-mediated DNA uptake and
electroporation using, for example, standard procedures according to
Hartmann et al. [(1998) Plant Molecular Biology 36:741]. The cell into
which such DNA is introduced can be grown under selective conditions, or
can initially be grown under non-selective conditions and then transferred
to selective media. The cells or protoplasts can be placed on plates
containing a selection agent to grow, for example, individual calli.
Resistant calli can be scored for scorable marker expression. Metaphase
spreads of resistant cultures can be prepared, and the metaphase
chromosomes examined by FISH analysis using specific probes in order to
detect amplification of regions of the chromosomes. Cells that have
artificial chromosomes with functioning centromeres or artificial
chromosomal intermediate structures, including, but not limited to, dicentric
chromosomes, formerly dicentric chromosomes, minichromosomes,
heterochromatin structures (e.g. sausage chromosomes), and stable
self-replicating artificial chromosomal intermediates as described herein, are
identified and cultured. In particular, the cells containing self-replicating
artificial chromosomes are identified.

The DNA introduced into a plant cell for the generation of PACs
can be in any form, including in the form of a vector. An exemplary
vector for use in methods of generating PACs can be prepared as follows.

For the production of artificial chromosomes, plant transformation
vectors, as exemplified by pAgIIa and pAgIIb, containing a selectable
marker, a targeting sequence, and a scorable marker were constructed
using procedures well known in the art to combine the various fragments.
The vectors can be prepared using vector pAg1 as a base vector and
inserting the following DNA fragments into pAg1: DNA encoding
β-glucuronidase under the control of the nopaline synthase (NOS) promoter
fragment and flanked at the 3' end by the NOS terminator fragment, a
fragment of mouse satellite DNA and an N. tabacum rDNA intergenic
spacer sequence (IGS). In constructing plant transformation vectors,
vector pAg2 can also be used as the base vector.

1. Construction of pAg1

Vector pAg1 (SEQ. ID. NO: 89) is a derivative of the CAMBIA
vector named pCambia 3300 (Center for the Application of Molecular
Biology to International Agriculture, i.e., CAMBIA, Canberra, Australia;
www.cambia.org), which is a modified version of vector pCambia 1300
to which has been added DNA from the bar gene conferring resistance to
phosphinothricin. The nucleotide sequence of pCambia 3300 is provided
in SEQ. ID. NO: 90. pCambia 3300 also contains a lacZ alpha sequence
containing a polylinker region.

pAg1 was constructed by inserting two new functional DNA
fragments into the polylinker of pCambia 3300: one sequence containing
an attB site and a promoterless zeomycin resistance-encoding DNA
flanked at the 3' end by an SV40 polyA signal sequence, and a second



WO 02/097059 



PCT/US02/17452 




sequence containing DNA from the hygromycin resistance gene
(hygromycin phosphotransferase) conferring resistance to hygromycin for
selection in plants. Although the zeomycin-SV40 polyA signal fusion is
not expected to function in plant cells, it can be activated in mammalian
cells by insertion of a functional promoter element into the attB site by
site-specific recombination catalyzed by the lambda att integrase. Thus,
the inclusion of the attB-zeomycin sequences allows for evaluation of
functionality of plant artificial chromosomes in mammalian cells by
activation of the zeomycin resistance-encoding DNA, and provides an att
site for further insertion of new DNA sequences into plant artificial
chromosomes formed as a result of using pAg1 for plant transformation.
The second functional DNA fragment allows for selection of plant cells
with hygromycin. Thus, pAg1 contains DNA from the bar gene conferring
resistance to phosphinothricin and DNA from the hygromycin resistance
gene, each resistance-encoding DNA under the control of a separate
cauliflower mosaic virus (CaMV) 35S promoter, and the attB-promoterless
zeomycin resistance-encoding DNA.

pAg1 is a binary vector containing Agrobacterium right and left
T-DNA border sequences for use in Agrobacterium-mediated transformation
of plant cells or protoplasts with the DNA located between the border
sequences. pAg1 also contains the pBR322 Ori for replication in E. coli.
pAg1 was constructed by ligating HindIII/PstI-digested p3300attBZeo
with HindIII/PstI-digested pBSCaMV35SHyg as follows.
a. Generation of p3300attBZeo

Plasmid pCambia 3300 was digested with PstI/Ecl136II and ligated
with PstI/StuI-digested pLITattBZeo (the nucleotide sequence of
pLITattBZeo is provided in SEQ. ID. NO: 91), which contains DNA encoding
the zeocin resistance gene and an attB integrase recognition sequence, to
generate p3300attBZeo, which contains an attB site, a promoterless
zeomycin resistance-encoding DNA flanked at the 3' end by an SV40
polyA signal, and a reconstructed PstI site.

b. Generation of pBSCaMV35SHyg
A DNA fragment containing DNA encoding hygromycin
phosphotransferase flanked by the CaMV 35S promoter and the CaMV
35S polyA signal sequence was obtained by PCR amplification of plasmid
pCambia 1302 (GenBank Accession No. AF234298 and SEQ. ID. NO:
92). The primers used in the amplification reaction were as follows:
CaMV35SpolyA:

5'-CTGAATTAACGCCGAATTAATTCGGGGGATCTG-3' SEQ. ID. NO: 93
CaMV35Spr:

5'-CTAGAGCAGCTTGCCAACATGGTGGAGCA-3' SEQ. ID. NO: 94
The 2100-bp PCR fragment was ligated with EcoRV-digested pBluescript
II SK+ (Stratagene, La Jolla, CA, U.S.A.) to generate pBSCaMV35SHyg.
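As a quick sanity check on the two primers listed above, their lengths and G+C content can be tallied. The following is only an illustrative sketch: the function name primer_summary and the dictionary layout are hypothetical, not part of the described procedure, and no melting-temperature model is implied.

```python
# Primer sequences copied from SEQ. ID. NOs: 93 and 94 as given in the text.
PRIMERS = {
    "CaMV35SpolyA": "CTGAATTAACGCCGAATTAATTCGGGGGATCTG",
    "CaMV35Spr": "CTAGAGCAGCTTGCCAACATGGTGGAGCA",
}


def primer_summary(seq):
    """Return length, G+C count and G+C fraction for a primer (hypothetical helper)."""
    seq = seq.upper()
    gc_count = sum(base in "GC" for base in seq)
    return {
        "length": len(seq),
        "gc_count": gc_count,
        "gc_fraction": gc_count / len(seq),
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, primer_summary(seq))
```

Such a tally only describes the oligonucleotides themselves; it says nothing about the amplification conditions, which the text does not specify.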

c. Generation of pAg1

To generate pAg1, pBSCaMV35SHyg was digested with
HindIII/PstI and ligated with HindIII/PstI-digested p3300attBZeo. Thus,
pAg1 contains the pCambia 3300 backbone with DNA conferring
resistance to phosphinothricin and hygromycin under the control of
separate CaMV 35S promoters, an attB-promoterless zeomycin
resistance-encoding DNA recombination cassette and unique sites for
adding additional markers, e.g., DNA encoding GFP. The attB site can be
used as described herein for the addition of new DNA sequences to plant
artificial chromosomes, including PACs formed as a result of using the
pAg1 vector, or derivatives thereof, in the production of PACs. The attB
site provides a convenient site for recombinase-mediated insertion of
DNAs containing a homologous att site.
2. pAg2
The vector pAg2 (SEQ. ID. NO: 95) is a derivative of vector pAg1
formed by adding DNA encoding a green fluorescent protein (GFP), under
the control of a NOS promoter and flanked at the 3' end by a NOS polyA
signal, to pAg1. pAg2 was constructed as follows. A DNA fragment
containing the NOS promoter was obtained by digestion of pGEM-T-NOS,
or pGEMEasyNOS (SEQ. ID. NO: 96), containing the NOS promoter in the
cloning vector pGEM-T-Easy (Promega Biotech, Madison, WI, U.S.A.),
with XbaI/NcoI and was ligated to an XbaI/NcoI fragment of pCambia
1302 containing DNA encoding GFP (without the CaMV 35S promoter) to
generate p1302NOS (SEQ. ID. NO: 97) containing GFP-encoding DNA in
operable association with the NOS promoter. Plasmid p1302NOS was
digested with SmaI/BsiWI to yield a fragment containing the NOS
promoter and GFP-encoding DNA. The fragment was ligated with
PmeI/BsiWI-digested pAg1 to generate pAg2. Thus, pAg2 contains DNA
from the bar gene conferring resistance to phosphinothricin, DNA
conferring resistance to hygromycin, both resistance-encoding DNAs
under the control of a cauliflower mosaic virus 35S promoter, DNA
encoding kanamycin resistance, a GFP gene under the control of a NOS
promoter and the attB-zeomycin resistance-encoding DNA. One of skill in
the art will appreciate that other fragments can be used to generate the
pAg1 and pAg2 derivatives and that other heterologous DNA can be
incorporated into pAg1 and pAg2 derivatives using methods well known
in the art.

3. pAgIIa and pAgIIb transformation vectors
Vectors pAgIIa and pAgIIb were constructed by inserting the
following DNA fragments into pAg1: DNA encoding β-glucuronidase, the
nopaline synthase terminator fragment, the nopaline synthase (NOS)
promoter fragment, a fragment of mouse satellite DNA and an N. tabacum
rDNA intergenic spacer sequence (IGS). The construction of pAgIIa and
pAgIIb was as follows.

An N. tabacum rDNA intergenic spacer (IGS) sequence (SEQ. ID.
NO: 98; see also GenBank Accession No. Y08422; see also Borysyuk et
al. (2000) Nature Biotechnology 18:1303-1306; Borysyuk et al. (1997)
Plant Mol. Biol. 35:655-660; U.S. Patent Nos. 6,100,092 and 6,355,860)
was obtained by PCR amplification of tobacco genomic DNA. The IGS
can be used as a targeting sequence by virtue of its homology to tobacco
rDNA genes; the sequence is also an amplification promoter sequence in
plants. This fragment was amplified using standard PCR conditions (e.g.,
as described by Promega Biotech, Madison, WI, U.S.A.) from tobacco
genomic DNA using the primers shown below:
NTIGS-F1

5'-GTG CTA GCC AAT GTT TAA CAA GAT G-3' (SEQ ID No. 99) and
NTIGS-R1

5'-ATG TCT TAA AAA AAA AAA CCC AAG TGA C-3' (SEQ ID No. 100)
Following amplification, the fragment was cloned into pGEM-T Easy to
give pIGS-1. A fragment of mouse satellite DNA (Msat1 fragment;
GenBank Accession No. V00846; and SEQ ID No. 101) was amplified via
PCR from pSAT-1 using the following primers:
MSAT-F1

5'-AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C-3' (SEQ ID No. 102) and
MSAT-R1

5'-ATA ACC GCG GAG TCC TTC AGT GTG CAT-3' (SEQ ID No. 103)

This amplification added a SacII and a HindIII site at the 5' end and a
SacII site at the 3' end of the PCR fragment. This fragment was then cloned
into the SacII site in pIGS-1 to give pMIGS-1, providing a eukaryotic
centromere-specific DNA and a convenient DNA sequence for detection
via FISH.
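The restriction sites carried by the two mouse satellite primers can be confirmed by a simple string search. The sketch below is illustrative only: the helper name find_sites is hypothetical, and the recognition sequences used (CCGCGG for SacII, AAGCTT for HindIII) are the standard ones for these enzymes rather than values taken from this document. Because both sites are palindromic, scanning one strand is sufficient.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"SacII": "CCGCGG", "HindIII": "AAGCTT"}

# Primer sequences from SEQ ID Nos. 102 and 103, with spacing removed.
PRIMERS = {
    "forward": "AATACCGCGGAAGCTTGACCTGGAATATCGC",
    "reverse": "ATAACCGCGGAGTCCTTCAGTGTGCAT",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site present in seq."""
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Under these assumptions the forward primer carries a SacII site followed directly by a HindIII site, and the reverse primer carries a SacII site only, which agrees with the sites the amplification is stated to add.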

A functional marker gene containing a NOS-promoter:GUS:NOS
terminator fusion was then constructed containing the NOS promoter
(GenBank Accession No. U09365; SEQ ID No. 104), E. coli
β-glucuronidase coding sequence (from the GUS gene; GenBank
Accession No. S69414; and SEQ ID No. 105), and the nopaline synthase
terminator sequence (GenBank Accession No. U09365; SEQ ID No. 107).
The NOS promoter in pGEM-T-NOS was added to a promoterless GUS
gene in pBlueScript (Stratagene, La Jolla, CA, U.S.A.) using NotI/SpeI to
form pNGN-1, which has the NOS promoter in the opposite orientation
relative to the GUS gene.

pMIGS-1 was digested with NotI/SpeI to yield a fragment
containing the mouse major satellite DNA and the tobacco IGS, which was
then added to NotI-digested pNGN-1 to yield pNGN-2. The NOS promoter
was then re-oriented to provide a functional GUS gene, yielding pNGN-3,
by digestion and religation with SpeI. Plasmid pNGN-3 was then digested
with HindIII, and the HindIII fragment containing the β-glucuronidase
coding sequence and the rDNA intergenic spacer, along with the Msat
sequence, was added to pAg1 to form pAgIIa (SEQ ID NO: 108), using
the unique HindIII site in pAg1 located near the right T-DNA border of
pAg1, within the T-DNA region.

Another plasmid vector, referred to as pAgIIb, was also recovered,
which contained the inserted HindIII fragment (SEQ ID NO: 108) in the
opposite orientation relative to that observed in pAgIIa. Thus, pAgIIa and
pAgIIb differ only in the orientation of the HindIII fragment containing the
mouse major satellite sequence, the GUS DNA sequence and the IGS
sequence. The nucleotide sequence of pAgIIa is provided in SEQ. ID.
NO: 109.
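Why a single-enzyme insertion yields two products can be illustrated with a toy model. The sketch below is not the authors' procedure: the backbone and insert sequences are short made-up placeholders, the function names are hypothetical, and the model ignores overhang chemistry. It relies only on the fact that the HindIII site (AAGCTT) is its own reverse complement, so a HindIII fragment can ligate into a HindIII-cut vector in either orientation.

```python
HINDIII = "AAGCTT"  # standard HindIII recognition sequence (palindromic)
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def ligate(backbone, core, flipped=False):
    """Insert `core` at the unique HindIII site of `backbone` (toy model).

    The site is regenerated on both sides of the insert; `flipped=True`
    inserts the reverse complement, i.e. the opposite orientation.
    """
    if backbone.count(HINDIII) != 1:
        raise ValueError("backbone must contain exactly one HindIII site")
    cut = backbone.find(HINDIII)
    left, right = backbone[:cut], backbone[cut + len(HINDIII):]
    insert = reverse_complement(core) if flipped else core
    return left + HINDIII + insert + HINDIII + right


if __name__ == "__main__":
    backbone = "GGGG" + HINDIII + "CCCC"  # placeholder vector sequence
    core = "ATGAAACCC"                    # placeholder insert sequence
    print(ligate(backbone, core))                # one orientation
    print(ligate(backbone, core, flipped=True))  # opposite orientation
```

The two products have the same length and the same flanking sites and differ only in the strand of the insert that reads left to right, which is the kind of relationship described for the two recovered vectors.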

Since modifications will be apparent to those of skill in this art, it is
intended that this invention be limited only by the scope of the appended
claims.
WHAT IS CLAIMED IS:

1. A eukaryotic chromosome comprising one or a plurality of att
site(s), wherein:

an att site is heterologous to the chromosome; and
an att site permits site-directed integration in the presence of
lambda integrase.

2. The eukaryotic chromosome of claim 1, wherein the att sites
are selected from the group consisting of attP and attB or attL and attR,
or variants thereof.

3. The eukaryotic chromosome of claim 1 that is an artificial
chromosome.

4. The eukaryotic chromosome of claim 1 that is an artificial
chromosome expression system (ACes).

5. The eukaryotic chromosome of claim 4 that is predominantly
heterochromatin.

6. The chromosome of claim 1 that is an artificial chromosome
that contains no more than about 30%, 40%, 50%, 60%, 70%, 80%,
90% or 95% euchromatin.

7. The chromosome of claim 1 that is a plant chromosome.

8. The chromosome of claim 1 that is an animal chromosome.

9. The chromosome of claim 7 that is a plant artificial
chromosome.

10. The chromosome of claim 8 that is an animal artificial
chromosome.

11. The chromosome of claim 8 that is a mammalian
chromosome.

12. The chromosome of claim 11 that is a mammalian artificial
chromosome.

13. The chromosome of claim 6 that is an artificial chromosome
expression system (ACes).

14. A platform artificial chromosome expression system (ACes)
comprising one or a plurality of sites that participate in recombinase
catalyzed recombination.

15. The ACes of claim 14 that contains one site.

16. The ACes of claim 14 that is predominantly heterochromatin.

17. The ACes of claim 14 that contains no more than about
30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% euchromatin.

18. The ACes of claim 14 that is a plant ACes.

19. The ACes of claim 14 that is an animal ACes.

20. The ACes of claim 14 that is selected from a fish, insect,
reptile, amphibian, arachnid or a mammalian ACes.

21. The ACes of claim 14 that is a fish ACes.

22. The artificial chromosome expression system (ACes) of claim
14, wherein the recombinase and site(s) are from the Cre/lox system of
bacteriophage P1, the int/att system of lambda phage, the FLP/FRT
system of yeast, the Gin/gix recombinase system of phage Mu, the Cin
recombinase system, the Pin recombinase system of E. coli and the R/RS
system of the pSR1 plasmid, or any combination thereof.

23. A method of introducing heterologous nucleic acid into a
chromosome, comprising:

contacting a chromosome of any of claims 1 or 14 with a nucleic
acid molecule comprising both the heterologous nucleic acid and a
recombination site, in the presence of a recombinase that promotes
recombination between the sites in the chromosome and in the nucleic
acid molecule.

24. The method of claim 23, wherein the recombinase is
selected from the group consisting of Cre, Gin, Cin, Pin, FLP, a phage
integrase and R from the pSR1 plasmid.

25. The method of claim 23, wherein the nucleic acid molecule
encodes a therapeutic protein, antisense nucleic acid, or comprises an
artificial chromosome.

26. The method of claim 25, wherein the nucleic acid molecule
comprises a yeast artificial chromosome (YAC), a bacterial artificial
chromosome (BAC) or an insect artificial chromosome (IAC).

27. A combination, comprising the chromosome of claim 1 and a
first vector comprising the cognate recombination site, wherein the
cognate recombination site is a site that recombines with the site
engineered into the chromosome.

28. The combination of claim 27, further comprising nucleic acid
encoding a recombinase, wherein the nucleic acid is on a second vector
or on the first vector, or on the ACes under an inducible promoter.

29. The combination of claim 28, wherein the recombinase and
sites are from the Cre/lox system of bacteriophage P1, the int/att system
of lambda phage, the FLP/FRT system of yeast, the Gin/gix recombinase
system of phage Mu, the Pin recombinase system of E. coli and the R/RS
system of the pSR1 plasmid, or any combination thereof.

30. The combination of claim 28, wherein a vector is the plasmid
pCXLamIntR.

31. The combination of claim 27, wherein a vector is the plasmid
pDsRedN1-attB.

32. A kit, comprising the combination of claim 27 and optionally
instructions for introducing heterologous nucleic acid into the
chromosome.

33. A method for introducing heterologous nucleic acid into a
platform artificial chromosome, comprising:

(a) mixing an artificial chromosome comprising at least a first
recombination site and a vector comprising at least a second
recombination site and the heterologous nucleic acid;

(b) incubating the resulting mixture in the presence of at least one
recombination protein under conditions whereby recombination between
the first and second recombination sites is effected, thereby introducing
the heterologous nucleic acid into the artificial chromosome.

34. The method of claim 33, wherein the artificial chromosome is
an ACes.

35. The method of claim 33, wherein said mixing step (a) is
conducted in cells ex vivo.

36. The method of claim 33, wherein said mixing step (a) is
conducted extracellularly in an in vitro reaction mixture.

37. The method of claim 33, wherein the at least one
recombination protein is encoded by a bacteriophage selected from the
group consisting of bacteriophage lambda, phi 80, P22, P2, 186, P4 and
P1.

38. The method of claim 37, wherein the at least one
recombination protein is encoded by bacteriophage lambda, or mutants
thereof.

39. The method of claim 33, wherein at least one recombination
protein is selected from the group consisting of Int, IHF, Xis and Cre, γδ,
Tn3 resolvase, Hin, Gin, Cin and Flp.

40. The method of claim 32, wherein the recombination sites are
selected from the group consisting of att and lox P sites.

41. The method of claim 33, wherein the first and/or second
recombination site contains at least one mutation that removes one or
more stop codons.

42. The method of claim 33, wherein the first and/or second
recombination site contains at least one mutation that avoids hairpin
formation.

43. The method of claim 33, wherein the first and/or second
recombination site comprises at least a first nucleic acid sequence
selected from the group consisting of SEQ ID NOs:41-56:

a) RKYCWGCTTTYKTRTACNAASTSGB (m-att) (SEQ ID NO:41);
b) AGCCWGCTTTYKTRTACNAACTSGB (m-attB) (SEQ ID NO:42);
c) GTTCAGCTTTCKTRTACNAACTSGB (m-attR) (SEQ ID NO:43);
d) AGCCWGCTTTCKTRTACNAAGTSGB (m-attL) (SEQ ID NO:44);
e) GTTCAGCTTTYKTRTACNAAGTSGB (m-attP1) (SEQ ID NO:45);
f) AGCCTGCTTTTTTGTACAAACTTGT (attB1) (SEQ ID NO:46);
g) AGCCTGCTTTCTTGTACAAACTTGT (attB2) (SEQ ID NO:47);
h) ACCCAGCTTTCTTGTACAAACTTGT (attB3) (SEQ ID NO:48);
i) GTTCAGCTTTTTTGTACAAACTTGT (attR1) (SEQ ID NO:49);
j) GTTCAGCTTTCTTGTACAAACTTGT (attR2) (SEQ ID NO:50);
k) GTTCAGCTTTCTTGTACAAAGTTGG (attR3) (SEQ ID NO:51);
l) AGCCTGCTTTTTTGTACAAAGTTGG (attL1) (SEQ ID NO:52);
m) AGCCTGCTTTCTTGTACAAAGTTGG (attL2) (SEQ ID NO:53);
n) ACCCAGCTTTCTTGTACAAAGTTGG (attL3) (SEQ ID NO:54);
o) GTTCAGCTTTTTTGTACAAAGTTGG (attP1) (SEQ ID NO:55);
p) GTTCAGCTTTCTTGTACAAAGTTGG (attP2, P3) (SEQ ID NO:56);

and a corresponding or complementary DNA or RNA sequence,
wherein R = A or G, K = G or T/U, Y = C or T/U, W = A or T/U, N = A or C
or G or T/U, S = C or G, and B = C or G or T/U; and
the core region does not contain a stop codon in one or more
reading frames.

44. The method of claim 33, wherein the first and/or second
recombination site comprises at least a first nucleic acid sequence
selected from the group consisting of a mutated att recombination site
containing at least one mutation that enhances recombinational
specificity, a complementary DNA sequence thereto, and an RNA
sequence corresponding thereto.

45. The method of claim 33, wherein the vector comprising the
second site further encodes at least one selectable marker.

46. The method of claim 45, wherein the marker is a
promoterless marker, which, upon recombination is under the control of a
promoter and is thereby expressed.

47. The method of claim 46, wherein the first recombination site
is attP and is in the sense orientation prior to recombination.

48. The method of claim 46, wherein the selectable marker is
selected from the group consisting of an antibiotic resistance gene, and a
detectable protein, wherein the detectable protein is chromogenic,
fluorescent, or capable of being bound by an antibody and FACS sorted.

49. The method of claim 48, wherein the selectable marker is
selected from the group consisting of green fluorescent protein (GFP), red
fluorescent protein (RFP), blue fluorescent protein (BFP), and E. coli
histidinol dehydrogenase (hisD).

50. A cell comprising the chromosome of claim 1.

51. The cell of claim 50, wherein the cell is a nuclear donor cell.

52. The cell of claim 50, wherein the cell is a stem cell.

53. The stem cell of claim 52, wherein said stem cell is human
and is selected from the group consisting of a mesenchymal stem cell, a
hematopoietic stem cell, an adult stem cell and an embryonic stem cell.

54. The cell of claim 50, wherein the cell is mammalian.

55. The cell of claim 54, wherein the mammal is selected from
the group consisting of humans, primates, cattle, pigs, rabbits, goats,
sheep, mice, rats, guinea pigs, hamsters, cats, dogs, and horses.

56. The cell of claim 50, wherein the cell is a plant cell.

57. A cell comprising the platform ACes of claim 14.

58. The cell of claim 57, wherein the cell is a nuclear donor cell.

59. The cell of claim 57, wherein the cell is a stem cell.

60. The stem cell of claim 59, wherein said stem cell is human
and is selected from the group consisting of a mesenchymal stem cell, a
hematopoietic stem cell, an adult stem cell and an embryonic stem cell.

61. A human mesenchymal cell comprising an artificial
chromosome.

62. The human mesenchymal cell of claim 61, wherein said
artificial chromosome is an ACes.

63. The human mesenchymal cell of claim 62, wherein the ACes
is a platform-ACes.

64. A method for introducing heterologous nucleic acid into the
mesenchymal cell of claim 63, comprising:

(a) introducing into the cell of claim 63, wherein the platform-ACes
has a first recombination site, a vector comprising at least a second
recombination site and the heterologous nucleic acid;

(b) incubating the resulting mixture in the presence of at least one
recombination protein under conditions whereby recombination between
the first and second recombination sites is effected, thereby introducing
the heterologous nucleic acid into the platform-ACes within the
mesenchymal cell.

65. A lambda-intR mutein comprising a glutamic acid to arginine
change at position 174 of wild-type lambda-intR.

66. The lambda-intR mutein of claim 65, wherein the lambda-intR
mutein comprises SEQ ID NO:37.

67. The method of claim 46, wherein the promoterless marker is
transcriptionally downstream of the heterologous nucleic acid, wherein
the heterologous nucleic acid encodes a heterologous protein, and
wherein the expression level of the selectable marker is transcriptionally
linked to the expression level of the heterologous protein.

68. The method of claim 67, wherein the selectable marker and
the heterologous nucleic acid are transcriptionally linked by the presence
of an IRES between them.

69. The method of claim 68, wherein the selectable marker is
selected from the group consisting of an antibiotic resistance gene, and a
detectable protein, wherein the detectable protein is chromogenic or
fluorescent.

70. The method of claim 69, wherein the selectable marker is
selected from the group consisting of green fluorescent protein (GFP), red
fluorescent protein (RFP), blue fluorescent protein (BFP), and E. coli
histidinol dehydrogenase.

71. The method of claim 67, further comprising expressing the
heterologous protein and isolating the heterologous protein.

72. A method for producing a transgenic animal, comprising
introducing a platform-ACes into an embryonic cell.

73. The method of claim 72, wherein the embryonic cell is a
stem cell.

74. The method of claim 72, wherein the embryonic cell is in an
embryo.

75. The method of claim 72, wherein the platform-ACes
comprises heterologous nucleic acid that encodes a therapeutic product.

76. The method of claim 72, wherein the transgenic animal is a
fish, insect, reptile, amphibian, arachnid or mammal.

77. The method of claim 72, wherein the ACes is introduced by
cell fusion, lipid-mediated transfection by a carrier system, microinjection,
microcell fusion, electroporation, microprojectile bombardment or direct
DNA transfer.

78. A transgenic animal produced by the method of claim 72.

79. A cell line useful for making a library of ACes, comprising a
multiplicity of heterologous recombination sites randomly integrated
throughout the endogenous chromosomes.

80. A method of making a library of ACes comprising random
portions of a genome, comprising introducing one or more ACes into the
cell line of claim 79, under conditions that promote the site-specific
chromosomal arm exchange of the ACes into, and out of, a multiplicity of
the heterologous recombination sites within the cell's chromosomal DNA;
and isolating said multiplicity of ACes, thereby producing a library of
ACes whereby multiple ACes have different portions of the genome
within.

81. A library of cells useful for genomic screening, said library
comprising a multiplicity of cells, wherein each cell comprises an ACes
having a mutually exclusive portion of a chromosomal nucleic acid
therein.

82. The library of cells of claim 81, wherein the cells of the
library are from a different species than the chromosomal nucleic acid
within the ACes.

83. A method of making one or more cell lines, comprising

a) integrating into endogenous chromosomal DNA of a selected cell
species, a multiplicity of heterologous recombination sites,

b) introducing a multiplicity of ACes under conditions that promote
the site-specific chromosomal arm exchange of the ACes into, and out of,
a multiplicity of the heterologous recombination sites integrated within the
cell's endogenous chromosomal DNA;

c) isolating said multiplicity of ACes, thereby producing a library of
ACes whereby a multiplicity of ACes have mutually exclusive portions of
the endogenous chromosomal DNA therein;

d) introducing the isolated multiplicity of ACes of step c) into a
multiplicity of cells, thereby creating a library of cells;

e) selecting different cells having mutually exclusive ACes therein
and clonally expanding or differentiating said different cells into clonal cell
cultures, thereby creating one or more cell lines.

84. The method of claim 23, wherein the nucleic acid molecule
with a recombination site is a PCR product.

85. The method of claim 23, wherein the recombinase is a protein and
the recombination event occurs in vitro.

86. The method of claim 33, wherein the vector is a PCR
product comprising a second recombination site.

87. The lambda-intR mutein of claim 65, wherein the mutein
further comprises an amino acid signal for nuclear localization.

88. The lambda-intR mutein of claim 65, wherein the mutein
further comprises an epitope tag for protein purification.

89. A modified iron-induced promoter comprising SEQ ID
NO:128.

90. A plasmid or expression cassette comprising the promoter of
claim 89.

91. A vector, comprising:

a recognition site for recombination; and
a sequence of nucleotides that targets the vector to an
amplifiable region of a chromosome.

92. The vector of claim 91, wherein the amplifiable region
comprises heterochromatic nucleic acid.

93. The vector of claim 91, wherein the amplifiable region
comprises rDNA.

94. The vector of claim 93, wherein the rDNA comprises an
intergenic spacer.

95. The vector of claim 91, further comprising nucleic acid
encoding a selectable marker that is not operably associated with any
promoter.

96. The vector of claim 91, wherein the chromosome is a
mammalian chromosome.

97. The vector of claim 91, wherein the chromosome is a plant
chromosome.

98. A cell of claim 57 that is a plant cell, wherein the ACes
platform is a MAC.

99. The plant cell of claim 98, wherein the MAC comprises
transcriptional regulatory sequence of nucleotides derived from plants.

100. The plant cell of claim 99, wherein the regulatory sequence
is selected from the group consisting of promoters, terminators,
enhancers, silencers and transcription factor binding sites.

101. A cell of claim 57 that is an animal cell, wherein the ACes
platform is a plant artificial chromosome (PAC).

102. The cell of claim 101 that is a mammalian cell.

103. The cell of claim 98, wherein the MAC comprises
transcriptional regulatory sequence of nucleotides derived from plants.

104. The cell of claim 102, wherein the MAC comprises
transcriptional regulatory sequence of nucleotides derived from plants.

105. The cell of claim 104, wherein the regulatory sequence is
selected from the group consisting of promoters, terminators, enhancers,
silencers and transcription factor binding sites.

106. A method, comprising:

introducing a vector of claim 91 into a cell;

growing the cells; and

selecting a cell comprising an artificial chromosome that comprises
one or more repeat regions.

107. The method of claim 106, wherein sufficient portion of the
vector integrates into a chromosome in the cell to result in amplification
of chromosomal DNA.

108. The method of claim 106, wherein the artificial chromosome
is an ACes.

109. A method for screening, comprising:

contacting a cell comprising a reporter ACes with test compounds
or known compounds, wherein:

the reporter ACes comprises one or a plurality of reporter
constructs;

a reporter construct comprises a reporter gene in operative linkage
with a regulatory region responsive to test or known compounds; and

detecting any increase or decrease in signal output from the
reporter, wherein a change in the signal is indicative of activity of the test
or known compound on the regulatory region.

110. The method of claim 109, wherein the reporter is operatively
linked to a promoter that controls expression of a gene in a signal
transduction pathway, whereby activation or reduction in the signal
indicates that the pathway is activated or down-regulated by the test
compound.

111. The method of claim 109, wherein the reporter in the
construct encodes drug resistance or encodes a fluorescent protein.

112. The method of claim 111, wherein the fluorescent protein is
selected from the group consisting of red, green and blue fluorescent
proteins.

113. The method of claim 109, wherein the ACes comprises a
plurality of reporter-linked constructs, each with a different reporter,
whereby the pathway(s) affected by the test compounds can be
elucidated.

114. The method of claim 109, wherein a reporter is operatively
linked to a promoter that is transcriptionally regulated in response to DNA
damage, and the test compounds are genotoxicants.

115. The method of claim 114, wherein the DNA damage is
induced by apoptosis, necrosis or cell-cycle perturbations.

116. The method of claim 114, wherein unknown compounds are
screened to assess whether they are genotoxicants.

117. The method of claim 114, wherein the promoter is a
cytochrome P450-profiled promoter.

118. The method of claim 114, wherein the cell is in a transgenic
animal and toxicity is assessed in the animal.

119. The method of claim 109, wherein:

the cell is a patient cell sample; the patient has a disease;

the regulatory region is one targeted by a drug or drug regimen;
and

the method assesses the effectiveness of a treatment for the
disease for the particular patient.

120. The method of claim 119, wherein the cell is a tumor cell.

121. The method of claim 109, wherein the cell is a stem cell or a
progenitor cell, whereby expression of the reporter is operatively linked to
a regulatory region expressed in the cells to thereby identify stem cells or
progenitor cells.

122. The method of claim 109, wherein the cell is in an animal;
and the method comprises whole-body imaging to monitor expression of
the reporter in the animal.

123. A reporter ACes comprising one or a plurality of reporter
constructs, wherein the reporter construct comprises a reporter gene in
operative linkage with a regulatory region responsive to test or known
compounds.



WO 02/097059 



PCT/US02/17452 








SEQUENCE LISTING 



<110> CHROMOS MOLECULAR SYSTEMS, INC. 
Perkins, Edward
Perez, Carl 
Lindenbaum, Michael 
Greene, Amy 
Leung, Josephine 
Fleming, Elena 
Stewart, Sandra 
Shellard, Joan 



<120> CHROMOSOME-BASED PLATFORMS

<130> 24601-420PC 

<140> Not Yet Assigned 
<141> Herewith 

<150> 60/294,758 
<151> 2001-05-30 

<150> 60/366,891 
<151> 2002-03-21

<160> 129 

<170> FastSEQ for Windows Version 4.0 

<210> 1
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: attPUP

<400> 1
ccttgcgcta atgctctgtt acagg 25

<210> 2
<211> 26
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: attPDWN

<400> 2
cagaggcagg gagtgggaca aaattg 26

<210> 3
<211> 35
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: Lamint 1

<400> 3
ttcgaattca tgggaagaag gcgaagtcat gagcg 35

<210> 4
<211> 34
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: Lamint 2

<400> 4
ttcgaattct tatttgattt caattttgtc ccac 34

<210> 5
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 5
cggacaatgc ggttgtgcgt 20

<210> 6
<211> 46
<212> DNA
<213> Artificial Sequence

<220>
<223> primer

<400> 6
cgcgcagcaa aatctagagt aaggagatca agacttacgg ctgacg 46

<210> 7
<211> 46
<212> DNA
<213> Artificial Sequence

<220>
<223> LambdaINTER174rev

<400> 7
cgtcagccgt aagtcttgat ctccttactc tagattttgc tgcgcg 46

<210> 8
<211> 33
<212> DNA
<213> Artificial Sequence

<220>
<223> attB1

<400> 8
tgaagcctgc ttttttatac taacttgagc gaa 33

<210> 9
<211> 33
<212> DNA
<213> Artificial Sequence

<220>
<223> attB2

<400> 9
ttcgctcaag ttagtataaa aaagcaggct tca 33

<210> 10
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: attPdwn2

<400> 10
tcttctcggg cataagtcgg acacc 25



<210> 11
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: CMVen

<400> 11
ctcacgggga tttccaagtc tccac 25

<210> 12
<211> 26
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: attPdwn

<400> 12
cagaggcagg gagtgggaca aaattg 26

<210> 13
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: CMVEN2

<400> 13
caactccgcc ccattgacgc aaatg 25

<210> 14
<211> 26
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: L1

<400> 14
agtatcgccg aacgattagc tcttca 26

<210> 15
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: F1 rev

<400> 15
gccgatttcg gcctattggt taaa 24

<210> 16
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: RED

<400> 16
ccgccgacat ccccgactac aagaa 25

<210> 17
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer: L2 rev

<400> 17
ttccttcgaa ggggatccgc ctacc 25

<210> 18
<211> 22118
<212> DNA
<213> Mus musculus

<300>
<308> GenBank X82564
<309> 1996-04-09

<400> 18

gaattcccct atccctaatc cagattggtg gaataacttg gtatagatgt ttgtgcatta 60
aaaaccctgt aggatcttca ctctaggtca ctgttcagca ctggaacctg aattgtggcc 120
ctgagtgata ggtcctggga catatgcagt tctgcacaga cagacagaca gacagacaga 180
cagacagaca gacagacgtt acaaacaaac acgttgagcc gtgtgccaac acacacacaa 240
acaccactct ggccataatt attgaggacg ttgatttatt attctgtgtt tgtgagtctg 300
tctgtctgtc tgtctgtctg tctgtctgtc tatcaaacca aaagaaacca aacaattatg 360
cctgcctgcc tgcctgcctg cctacacaga gaaatgattt cttcaatcaa tctaaaacga 420
cctcctaagt ttgccttttt tctctttctt tatctttttc ttttttcttt tcttcttcct 480
tccttccttc cttccttcct tccttccttt ctttctttct ttctttcttt cttactttct 540
ttctttcctt cttacattta ttcttttcat acatagtttc ttagtgtaag catccctgac 600
tgtcttgaag acactttgta ggcctcaatc ctgtaagagc cttcctctgc ttttcaaatg 660
ctggcatgaa tgttgtacct cactatgacc agcttagtct tcaagtctga gttactggaa 720
aggagttcca agaagactgg ttatattttt catttattat tgcattttaa ttaaaattta 780
atttcaccaa aagaatttag actgaccaat tcagagtctg ccgtttaaaa gcataaggaa 840
aaagtaggag aaaaacgtga ggctgtctgt ggatggtcga ggctgcttta gggagcctcg 900
tcaccattct gcacttgcaa accgggccac tagaacccgg tgaagggaga aaccaaagcg 960
acctggaaac aataggtcac atgaaggcca gccacctcca tcttgttgtg cgggagttca 1020
gttagcagac aagatggctg ccatgcacat gttgtctttc agcttggtga ggtcaaagta 1080
caaccgagtc acagaacaag gaagtataca cagtgagttc caggtcagcc agagtttaca 1140
cagagaaacc acatcttgaa aaaaacaaaa aaataaatta aataaatata atttaaaaat 1200
ttaaaaatag ccgggagtga tggcgcatgt ctttaatccc agctctcttc aggcagagat 1260
gggaggattt ctgagtttga ggccagcctg gtctgcaaag tgagttccag gacagtcagg 1320
gctatacaga gaaaccctgt cttgaaaact aaactaaatt aaactaaact aaactaaaaa 1380
aatataaaat aaaaatttta aagaatttta aaaaactaca gaaatcaaac ataagcccac 1440
gagatggcaa gtaactgcaa tcatagcaga aatattatac acacacacac acacagactc 1500
tgtcataaaa tccaatgtgc cttcatgatg atcaaatttc gatagtcagt aatactagaa 1560
gaatcatatg tctgaaaata aaagccagaa ccttttctgc ttttgttttc ttttgcccca 1620
agatagggtt tctctcagtg tatccctggc atccctgcct ggaacttcct ttgtaggttt 1680
ggtagcctca aactcagaga ggtcctctct gcctgcctgc ctgcctgcct gcctgcctgc 1740
ctgcctgcct gcctgcctca cttcttctgc cacccacaca accgagtcga acctaggatc 1800
tttatttctt tctctttctc tcttctttct ttctttcttt ctttctttct ttctttcttt 1860
ctttctttct ttcttattca attagttttc aatgtaagtg tgtgtttgtg ctctatctgc 1920
tgcctatagg cctgcttgcc aggagagggc aacagaacct aggagaaacc accatgcagc 1980
tcctgagaat aagtgaaaaa acaacaaaaa aaggaaattc taatcacata gaatgtagat 2040
atatgccgag gctgtcagag tgctttttaa ggcttagtgt aagtaatgaa aattgttgtg 2100
tgtcttttat ccaaacacag aagagaggtg gctcggcctg catgtctgtt gtctgcatgt 2160
agaccaggct ggccttgaac acattaatct gtctgcctct gcttccctaa tgctgcgatt 2220
aaaggcatgt gccaccactg cccggactga tttcttcttt tttttttttt tggaaaatac 2280
ctttctttct ttttctctct ctctttcttc cttccttcct ttctttctat tctttttttc 2340
tttctttttt cttttttttt ttttttttaa aatttgccta aggttaaagg tgtgctccac 2400
aattgcctca gctctgctct aattctcttt aaaaaaaaac aaacaaaaaa aaaaccaaaa 2460
cagtatgtat gtatgtatat ttagaagaaa tactaatcca ttaataactc ttttttccta 2520
aaattcatgt cattcttgtt ccacaaagtg agttccagga cttaccagag aaaccctgtg 2580
ttcaaatttc tgtgttcaag gtcaccctgg cttacaaagt gagttccaag tccgataggg 2640

ctacacagaa aaaccatatc tcagaaaaaa aaaaagttcc aaacacacac acacacacac 2700
acacacacac acacacacac acacacacac acacacacag cgcgccgcgg cgatgagggg 2760
aagtcgtgcc taaaataaat atttttctgg ccaaagtgaa agcaaatcac tatgaagagg 2820
tactcctaga aaaaataaat acaaacgggc tttttaatca ttccagcact gttttaattt 2880
aactctgaat ttagtcttgg aaaagggggc gggtgtgggt gagtgagggc gagcgagcag 2940
acgggcgggc gggcgggtga gtggccggcg gcggtggcag cgagcaccag aaaacaacaa 3000
accccaagcg gtagagtgtt ttaaaaatga gacctaaatg tggtggaacg gaggtcgccg 3060
ccaccctcct cttccactgc ttagatgctc ccttcccctt actgtgctcc cttcccctaa 3120
ctgtgcctaa ctgtgcctgt tccctcaccc cgctgattcg ccagcgacgt actttgactt 3180
caagaacgat tttgcctgtt ttcaccgctc cctgtcatac tttcgttttt gggtgcccga 3240
gtctagcccg ttcgctatgt tcgggcggga cgatggggac cgtttgtgcc actcgggaga 3300
agtggtgggt gggtacgctg ctccgtcgtg cgtgcgtgag tgccggaacc tgagctcggg 3360
agaccctccg gagagacaga atgagtgagt gaatgtggcg gcgcgtgacg gatctgtatt 3420
ggtttgtatg gttgatcgag accattgtcg ggcgacacct agtggtgaca agtttcggga 3480
acgctccagg cctctcaggt tggtgacaca ggagagggaa gtgcctgtgg tgaggcgacc 3540
agggtgacag gaggccgggc aagcaggcgg gagcgtctcg gagatggtgt cgtgtttaag 3600
gacggtctct aacaaggagg tcgtacaggg agatggccaa agcagaccga gttgctgtac 3660
gcccttttgg gaaaaatgct agggttggtg gcaacgttac taggtcgacc agaaggctta 3720
agtcctaccc ccccccccct tttttttttt tttcctccag aagccctctc ttgtccccgt 3780
caccgggggc accgtacatc tgaggccgag aggacgcgat gggcccggct tccaagccgg 3840
tgtggctcgg ccagctggcg cttcgggtct tttttttttt tttttttttt ttttcctcca 3900
gaagccttgt ctgtcgctgt caccgggggc gctgtacttc tgaggccgag aggacgcgat 3960
gggccccggc ttccaagccg gtgtggctcg gccagctgga gcttcgggtc tttttttttt 4020
tttttttttt tttttttctc cagaagcctt gtctgtcgct gtcaccgggg gcgctgtact 4080
tctgaggccg agaggacgcg atgggtcggc ttccaagccg atgtggcggg gccagctgga 4140
gcttcgggtt tttttttttc ctccagaagc cctctcttgt ccccgtcacc gggggcgctg 4200
tacttctgag gccgagagga cgtgatgggc ccgggttcca ggcggatgtc gcccggtcag 4260
ctggagcttt ggatcttttt tttttttttt cctccagaag ccctctcttg tccccgtcac 4320
cgggggcacc ttacatctga gggcgagagg acgtgatggg tccggcttcc aagccgatgt 4380
ggcggggcca gctggagctt cgggtttttt ttttttcctc cagaagccct ctcttgtccc 4440
cgtcaccggg ggcgctgtac ttctgaggcc gagaggacgt gatgggcccg ggttccaggc 4500
ggatgtcgcc cggtcagctg gagctttgga tcattttttt ttttccctcc agaagccctc 4560
tcttgtcccc gtcaccgggg gcaccgtaca tctgaggccg agaggacacg atgggcctgt 4620
cttccaagcc gatgtggccc ggccagctgg agcttcgggt cttttttttt ttttttcctc 4680
cagaagcctt gtctgtcgct gtcacccggg gcgctgtact tctgaggccg agaggacgcg 4740
atgggcccgg cttccaagcc ggtgtggctc ggccagctgg agcttcgggt cttttttttt 4800
tttttttttt ttcctccaga aaccttgtct gtcgctgtca cccggggcgc ttgtacttct 4860
gatgccgaga ggacgcgatg ggcccgtctt ccaggccgat gtggcccggt cagctggagc 4920
tttggatctt tttttttttt ttttcctcca gaagccctct cttgtccccg tcaccggggg 4980
caccttacat ctgaggccta gaggacacga tgggcccggg ttccaggccg atgtggcccg 5040
gtcagctgga gctttggatc tttttttttt ttttcttcca gaagccctct tgtccccgtc 5100
accggtggca ctgtacatct gaggcggaga ggacattatg ggcccggctt ccaatccgat 5160
gtggcccggt cagctggagc tttggatctt attttttttt taattttttc ttccagaagc 5220
cctcttgtcc ctgtcaccgg tggcacggta catctgaggc cgagaggaca ttatgggccc 5280
ggcttccagg ccgatgtggc ccggtcagct ggagctttgg atcttttttt ttttttttct 5340
tttttcctcc agaagccctc tctgtccctg tcaccggggg ccctgtacgt ctgaggccga 5400
gggaaagcta tgggcgcggt tttctttcat tgacctgtcg gtcttatcag ttctccgggt 5460
tgtcagggtc gaccagttgt tcctttgagg tccggttctt ttcgttatgg ggtcattttt 5520
gggccacctc cccaggtatg acttccaggc gtcgttgctc gcctgtcact ttcctccctg 5580
tctcttttat gcttgtgatc ttttctatct gttcctattg gacctggaga taggtactga 5640
cacgctgtcc tttccctatt aacactaaag gacactataa agagaccctt tcgatttaag 5700
gctgttttgc ttgtccagcc tattcttttt actggcttgg gtctgtcgcg gtgcctgaag 5760
ctgtccccga gccacgcttc ctgctttccc gggcttgctg cttgcgtgtg cttgctgtgg 5820
gcagcttgtg acaactgggc gctgtgactt tgctgcgtgt cagacgtttt tcccgatttc 5880
cccgaggtgt cgttgtcaca cctgtcccgg ttggaatggt ggagccagct gtggttgagg 5940
gccaccttat ttcggctcac tttttttttt tttttttctc ttggagtccc gaacctccgc 6000
tcttttctct tcccggtctt tcttccacat gcctcccgag tgcatttctt tttgtttttt 6060
ttcttttttt tttttttttt ttggggaggt ggagagtccc gagtacttca ctcctgtctg 6120
tggtgtccaa gtgttcatgc cacgtgcctc ccgagtgcac ttttttttgt ggcagtcgct 6180
cgttgtgttc tcttgttctg tgtctgcccg tatcagtaac tgtcttgccc cgcgtgtaag 6240
acattcctat ctcgcttgtt tctcccgatt gcgcgtcgtt gctcactctt agatcgatgt 6300
ggtgctccgg agttctcttc gggccagggc caagccgcgc caggcgaggg acggacattc 6360
atggcgaatg gcggccgctc ttctcgttct gccagcgggc cctcgtctct ccaccccatc 6420
cgtctgccgg tggtgtgtgg aaggcagggg tgcggctctc cggcccgacg ctgccccgcg 6480
cgcacttttc tcagtggttc gcgtggtcct tgtggatgtg tgaggcgccc ggttgtgccc 6540
tcacgtgttt cactttggtc gtgtctcgct tgaccatgtt cccagagtcg gtggatgtgg 6600
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ccggtggcgt 
gaggtgcfccc 
gcgctcccca 
tgtctgagaa 
tcgtcgggtg 
ggtcgcggct 
gagaggcctg 
aatgcccttg 
ttggtcttct 
gtcggggttt 
ggaaagggtg 
ctcgccccct 
ggcctccccg 
ccgttgctgc 
gcacaccccc 
tgggtaggcg 
fcccgtcgcgfc 
ctgcgccgcg 
ccccccttcc 
cctcggggtc 
gttctgtggg 
gccgctcggg 
cggtgtcgcc 
ggtgtggtgg 
tgccctgacc 
gaggggcccg 
ccccctcccc 
acccgtggcc 
cggtcaccgg 
gagctgtggt 
gagagggctg 
agtggtcatt 
ccggccctgt 
accctggcgg 
gatgtctacc 
cctcgttcct 
catctctcgc 
tcgccggggg 
ctcgccggct 
gacgttgcgc 
gagcccctgc 
tgtgtcgcgt 
gacgggtggc 
tcgttggfcgt 
fccgccggtgt 
cggcccggtg 
gggacggagg 
gttggctttg 
tccggccgca 
cctcccgcga 
cctggtcctg 
ggtagcafcat 
agtgaaactg 
ctacttggat 
tcccgggggg 
ctccggccgg 
acgccccccg 
tcgccgtgcc 
gagcctgaga 
cgacccgggg 
ggaatgagtc 
cagccgcggt 
tagfctggafcc 
ccccttgcct 
gtttactttg 
aggaataatg 
taagagggac 



tgcataccct 
t ggagcgt t c 
ttccctggtg 
gcccgtgaga 
aggcgcccac 
gggg 1 1 ggaa 
gctttcgggg 
gaagagaacc 
ggfcttccctg 
tgggtccgtc 
cgggcttctt 
gaccgcctcc 
ctccgagttc 
ggagcatgtg 
gcgtgcgcgt 
acggtgggct 
gcgtccctct 
cgtggtgcgfc 
cgcggcagcg 
gagagggtcc 
agaacggctg 
ggtcttcgtc 
tcctcgggct 
gactgctcag 
ggtccgacgc 
tttcggccgc 
gctcgccgca 
gtgctgtcgg 
ggtcttgggg 
ttggagggcg 
cgtgcgaggg 
gtcccgacgg 
cgtccgtcgg 
tgggattaao 
tccctctccc 
ccctctcgcg 
gcaatggcgc 
ctggccgctg 
tcgcggactc 
ctcgctgctg 
cgcacccgcc 

cgggagcgtg 

ctatccaggg 
ggggagtgaa 
cgcgcttctc 
egg t egaegt 
ggagagcggg 
ccgcgtgcgt 
tgcactctcc 
ggctctccgc 
tcccaccccc 
gcttgtctca 
cgaatggctc 
aactgtggt a 
ggatgcgtgc 
gggtcgggcg 
tggeggegae 
taccatggtg 
aacggctacc 
aggtagtgac 
cactttaaat 
aattccagct 
ttgggagcgg 
ctcggcgccc 
aaaaaattag 
gaataggacc 
ggceggggge 



tcccgtctgg 
caggtttgtc 
tgcctccggt 

ggggggtcga 

cccgcgacta 
agtttctcga 
gggaccggtt 
ttcctgttgc 
tgtgctcgtc 
ccgccctcag 
aeggtctega 

c g c g c g c g ca 

ggggagggat 
geteggcttg 
actttcctcc 
cccgggtccc 
cgctcgcgtc 
gctgtgtgct 
ttcccacggc 
gtgtctggcg 
ttggccgcgt 
ggtaggcatc 
cc egggggge 
gggagtggtg 
ccgagcggtc 
ccttgccgtc 
gccggtctfct 
accccccgca 

gggggecgag 

tcccggcccc 
gaaaaggttg 
tgtggtggtc 
gaaggcgcgt 
cccgcgcgcg 
cgaggtctca 
gggttcaagt 
cgcccgagtt 
tccggtctct 
ctggcttcgc 
tgtgcttggg 

ggtgtgcggt 

tccgcctcgc 
ctcgcccccg 
tggtgctacc 
tttccgccaa 
tccggctctc 
taagagaggt 
gtgetcgegg 
cgttccgcgc 
cgccgccgcc 
gacgctccgc 
aagattaagc 
atfcaaatcag 
attctagagc 
atttatcaga 
ccggcggctt 
gacccattcg 
accaegggtg 
acatccaagg 
gaaaaataac 
cctttaacga 
ecaatagegt 
gcgggcggtc 
cctcgatgct 
agtgttcaaa 
gcggttctat 
attegtattg 



tgtgtgcacg 
tcctaggtgc 
gctccgtctg 
ggagagaagg 
gtacgcctgt 
gagactcatt 
gcagggtctc 
cgcagacccc 
gcatgcatcc 
tgagaaagtt 
ggggtctctc 
gcgtttgctc 
cacgcggggc 
tgtggttggfc 
cctcctgagg 
cacccgtctt 
cacgactttg 
tetegggctg 
tggegaaate 
ttgattgatc 
ccggcgcgac 
ggtgtgtcgg 
cgtcgtgttt 
cagtgtgatt 
tctcggtccc 
gtcgccggcc 
tttcctctct 
tgggggegge 
gggtaagaaa 
gcggccgfcgg 
ccccgcgagg 
tgttggccga 
gttggggcct 
tgtcccggtg 
ggccttctcc 
cgctcgtcga 
cacggtgggt 
cctgcccgac 
ccggagggtc 

gggggcccgc 

ttcgcgccgc 
ggeggctaga 

ccgacccccg 
ggtcattccc 
cccccacgcc 
ccgatgccga 
gteggagage 
acgggttttg 
gagcgcccgc 
tcctcctcct 
tcgcgcttcc 
catgcatgtc 
ttatggttcc 
taatacatgc 
tcaaaaccaa 
ggtgactcta 
aacgtctgcc 
aeggggaate 
aaggcagcag 
aatacaggac 
ggatccattg 
atattaaagt 
cgccgcgagg 
cttagctgag 
gcaggcccga 
tttgttggtt 
cgccgctaga 



cgctgtttct 
ctgcttctga 
gctgtgtgcc 

aggggcaaga 

gegtaggget 
gctttcccgt 
ccctgtccgc 
cccgcgcggt 
tetcteggtg 
tccttctcta 
ccgaatggtc 
tctcgtctac 
agagcctgtc 
ggctggggag 
gccgccgtgc 
cccgtgcctc 
gccgctcccg 
tgtggttgtg 
gegggagtec 
tcgctctcgg 
gtcggacgtg 
catcggtctc 
egggtegget 
cccgccggtt 
ttgtgaggac 
ctcgttctgc 
ccccccctct 
egggcaegta 
gteggctegg 
cggtgtcttg 
gcaaagggaa 
ggtgcgtctg 
gccggagtgc 
tggcggtggg 
gcgcgggctc 
cctcccctcc 
tcgtcctccg 
ccccgttggc 

agggggcttc 

tgcggcctcc 
ggtcagttgg 
cgcgggtgtc 
cctgcccgtc 
tcccgcgtgg 
aacccaccac 

ggggttcggg 

tgtcccgggg 
tcggaccccg 
ccggctcacc 
ctctcgcgct 
ttacctggtt 
t aagt aegea 
tttggtcgct 
cgacgggcgc 
cccggtgagc 
gataacctcg 
ctatcaactt 
agggttcgat 
gcgcgcaaat 
tctttcgagg 
gagggcaagt 
tgctgcagtt 
cgagtcaccg 
tgtcccgcgg 
gccgcctgga 
ttcggaactg 
ggtgaaattc 



tgtaagcgtc 
gctggtggtg 
ttcccgtttg 
ccccccttct 
ggtgcfcgagc 

ggggagcttt 

ggatgctcag 
cgcccgcgtg 
gccggggctc 
gctatcttcc 
ccctggaggg 
cgcggcccgc 
fcgtcgtcctg 
agggctccgt 
ggacggggtg 
acccgtgcct 
cgacggcggc 
tcgcctcgcc 
tccttcccct 
ggaegggace 
gggacccact 
tctctcgtgt 
cggcgctgca 
ttgcctcgcg 
ccccttccgg 
tgtgtcgttc 
cctctgactg 
cgcgtccggg 
egggegggag 

cgcggtcttg 

agaggctagc 

gggggctcgt 

cgaggtgggt 
ggctccggtc 
tcggccctcc 
tccgtccttc 
cctccgcttc 
gtggtcttct 
ccggttcccc 
gcccgcccgt 
gccctggcgt 
gc eggge t cc 
ccggtggtgg 
tttgactgtc 
cctgctctcc 
atttgtgccg 
cgacgctcgg 
aeggggtegg 
cccggtttgt 
ctctgtcccg 
gatcctgcca 
cggccggtac 
cgctcctctc 
tgacccccct 
tccctcccgg 
ggccgatcgc 
tcgatggtag 
teeggagagg 
tacccactcc 
cccbgtaatt 
ctggtgccag 
aaaaagctcg 
cccgtccccg 
ggcccgaagc 
taccgcagct 
aggecatgat 
ttggaccggc 



6660 
6720 
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6900 
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7080 
7140 
7200 
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7500 
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7800 
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8400 
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8580 
8640 
8700 
8760 
8820 
8880 
8940 
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9060 
9120 
9180 
9240 
9300 
9360 
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gcaagacgga 
tcggaggttc 
cgatgcggcg 
ggttccgggg 
ccaggagtgg 
cggacaggat 
ttcttagttg 
ctaactagtt 
gcgttcagcc 
tgcacgcgcg 
aacccgttga 
gaatfccccag 
accgcccgtc 
ggtcggccca 
agtaaaagtc 
ctgtggagga 
cgcgtgcgt c 
gaaggggtgg 
tcccctctcc 
gcgtcttgcc 
ggtttttgac 
cccafcccccg 
ggatgtgagt 
gtcctccccg 
ccccgggggg 
cggtcgttcg 
cccgaggcgg 
cccgacccgc 
gggttcccgt 
cacgtgtctc 
cctctctctc 
cgtgagttcg 
tgcgtcgatg 
catcgacact 
cgtcggttga 
cfccgcagggc 
gggcggttgt 
cgcgctcgcg 
gcctcgcgtc 
fcgggaaccca 
gaggtfcggcg 
ggttgtcggg 
gtttgggtct 
ggcgccgcgc 
gtatccccgg 
cctcggtggg 
cgtggctctt 

ccgcgggacg 
ggga gggaga 
ctgtgggctg 
ccctcccgcc 
gccgggtgcc 
tgtcccccct 
attagtcagc 
gaagagccca 
gacccactcc 
tggacggtgt 
gttgcttggg 
cgagaccgat 
tcaagagggc 
gattcaaccc 
ccccgttcct 
gcctccggcg 
gggtcggcgg 
ggcggtgcgc 
gggggggcgg 

ggccgcgctt 



ccagagcgaa 
gaagacgatc 
gcgttattcc 
ggagtatggfc 
gcctgcggct 
tgacagattg 
gtggagcgat 
acgcgacccc 
acccgagafcfc 
ctacactgac 
accccattcg 
taagtgcggg 
gctactaccg 
cggccctggc 
gtaacaaggt 
gcggcggcgfc 
ccgggtcccg 

gtggggtcgg 

ctcgtccggc 
tctttcccgt 
ccgtcccggg 
ccgcggctct 
gtcgcgtgtg 
cfccctgtccc 
gtcgccctgc 
ggcggctctc 
cggtcgtgtg 
gccgccggcfc 
gfccgttcccg 
gtttcgttcc 
cggggagagg 
ctcacacccg 
aagaacgcag 
tcgaacgcac 
cgatcaatcg 
caacccccca 
cggtgtggcg 
gcttcttccc 
ggcgcctccc 
ccgcgccccc 
gttgagggtg 
gtggcggtcg 
tgcgctgggg 
accctccggc 
tggcgttgcg 
cgccttcgcg 
cttcgtctcc 
ccgcggcgtc 
gggcctcgct 
tgcgtcccgg 
ggcctctcgg 
gtctctttcc 
ttctgaccgc 
ggaggaaaag 
gcgccgaatc 
ccggcgccgc 
gaggccggta 
aatgcagccc 
agtcaacaag 
gtgaaaccgt 
ggcggcgcgc 
cecgaccccfc 
gcgggcgcgg 
gggaccgccc 
cgcgaccggc 
cgcgtctcag 
tcgccgaatc 



agcatttgcc 
agataccgtc 
catgacccgc 
tgcaaagctg 
taatttgact 
atagctcttt 
fcfcgtcfcggtt 
cgagcggtcg 
gagcaataac 
tggctcagcg 
tgatggggat 
tcataagctt 
attggatggt 
ggagcgcfcga 
ttccgtaggt 
ggcccgctct 
tcgcccgcgt 
tctgggtccg 
tctgacctcg 
ccggctcttc 
ggcgttcggt 
ggcttttcta 
ggctcgcccg 
gggtacctag 
cgcccccagg 
ccfccagactc 

ggggggtgga 

tgcccgattt 
tgtttttccg 
tgctggccgg 
agggcggtgg 
aaataccgat 
cfcagctgcga 
ttgcggcccc 
cgtcacccgc 
acccgggtcg 
cgcgcgcccg 
gctccgccgt 
ggacegcfcgc 
gtggcgcccg 
tgcgtgcgcc 
acgagggccg 
gaggcggggt 
ttgtgtggag 
agggagggtt 
ccgcacgcgg 
gcttctcctt 
cgtgcgecga 
gacccgttgc 

gggttgcgtg 

ggaccccctg 
cgcccgcctc 
gacctcagat 
aaactaacca 
cccgccgcgc 
tcgtgggggg 
gcggccccgg 
aaagcgggtg 
taccgtaagg 
taagaggtaa 
gtccggccgt 
ccacccgcgc 

ggggtggtgt 

ccggccggcg 
tccgggacgg 
ggcgcgccga 
ccggggccga 
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aagaatgttt 
gtagttccga 
cgggcagctt 
aaacttaaag 
caacacggga 
ctcgattccg 
aattccgata 
gcgtccccca 
aggtctgtga 
tgtgcctacc 

cggggattgc 

gcgttgatta 
ttagtgaggc 
gaagacggtc 
gaacctgcgg 
ccccgtcttg 
gtggagcgag 
tctgggaccg 
ccaccctacc 
cgtgtctacg 
cgtcggggcg 
cgttggctgg 
tcccgatgcc 
ctgtcgcgtt 
gtcggggggc 
catgaccctc 
tgtctggagc 
ccgcgggtcg 
ctcccgaccc 
cctgaggcta 
fccgfctggggg 
acgactctta 
gaattaatgt 
gggttcctcc 
tgcggtgggt 
ggccctccgt 
cgtcgcggag 
tcccgccctc 
ctcaccagtc 

ggggtgggcg 
gaggtggtgg 
gtcggtcgcc 
cgaccgctcg 
ggagagcgag 
tggcgtcccg 
ccgctagggg 
cacccgggcg 
tgcgagtcac 
gtcccggctt 
tgagtaagat 
agacggttcg 
ctcgctctct 
cagacgtggc 
ggattccctc 
gtcgcggcgt 
cccaagtcct 
cgcgccgggc 
gtaaactcca 
gaaagttgaa 
acgggtgggg 
gcccggtggt 
gtcgttcccc 
ggtggtggcg 
accggccgcc 
ccgggaaggc 
accacctcac 
ggaagccaga 



tcattaatca 
ccataaacga 
ccgggaaacc 
gaat tgacgg 
aacctcaccc 

tgggtggtgg 

acgaacgaga 
acttcttaga 
tgcccttaga 
cfcgcgccggc 
aattattccc 
agtccctgcc 
cctcggatcg 
gaacttgact 
aaggat cat t 
tgtgtgtcct 
gtgtctggag 
cctccgattt 
gcggcggcgg 

aggggcggta 

cgcgctttgc 
ggcggttgfcc 
acgctttfccfc 
ccggcgcgga 

ggtggggccc 

ctccccccgc 
cccctcgggc 
gtcctgtcgg 
tttttttttc 
cccctcggtc 
actgtgccgt 
gcggfcggatc 
gaattgcagg 
cggggc t acg 
gctgcgcggc 
ctcccgaagt 
cctggtctcc 
gcccgtgcac 
ttitcfccggtc 
cgtccgcatc 
tcggtcccct 
tgcggtggtt 
cggggttggc 
ggcgagaacg 
cgtccgtccg 
egg t cggggc 
gtacccgctc 
ccccgggtgfc 
ccctgggggg 
cctccacccc 
ccggctcgtc 
tcttcccgcg 
gacccgctga 
agtaacggcg 
gggaaatgtg 
tctgatcgag 
tegggtctte 
tctaaggcta 
aagaactttg 
tccgcgcagt 
cccggcggat 
tcttcctccc 
cgcgggcggg 
gccgggcgca 
ccggtgggga 
cccgagtgtt 
tacccgtcgc 



agaacgaaag 
tgccgactgg 
aaagtctttg 
aagggcacca 
ggee eggaca 
tgcatggccg 
ctctggcatg 
gggacaagtg 
tgfcccggggc 
aggegegggt 
catgaacgag 
ctttgtacac 
gccccgccgg 
atctagagga 
aaaegggaga 
egcegggagg 
tgaggtgaga 
cccctccccc 
ctgctcgcgg 
cgtcgttacg 
tctcccggca 
gcgtgtgggg 
ggcctcgcgt 
gg t 1 1 a agga 
gtagggaagt 
tgccgccgtt 
gccgtggggg 
tgccggtcgt 
ctccccccca 
catctgttct 
cgtcagcacc 
actcggctcg 
acacattgat 
ccfcgfccfcgag 
tgggagtttg 
tcagacgtgt 
cccgcgcatc 
cccggtcctg 
ccgtgccccg 
tgctctggtc 
gcggccgcgg 
gtctgtgtgt 
gcggtcgccc 
gagagaggtg 
tccctccctc 
ccgtggcccc 
cggcgccggc 
tgcgagttcg 
gacccggcgt 
cgccgccctc 
ctcccgtgcc 
gctgggcgcg 
atttaagcat 
agtgaacagg 
gcgtacggaa 
gcccagcccg 
ceggagtegg 
aataceggea 
aagagagagt 
ccgcccggag 
ctttcccgct 
cgcgtccggc 
geegggggtg 
cttccaccgt 
aggfcggctcg 
acagccctcc 
cgcgctctcc 



10680 
10740 
10800 
10860 
10920 
10980 
11040 
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520 
11580 
11640 
11700 
11760 
11820 
11880 
11940 
12000 
12060 
12120 
12180 
12240 
12300 
12360 
12420 
12480 
12540 
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
13140 
13200 
13260 
13320 
13380 
13440 
13500 
13560 
13620 
13680 
13740 
13800 
13860 
13920 
13980 
14040 
14100 
14160 
U4220 
14280 
14340 
14400 
14460 
14520 
14580 
14640

ctctcccccc gtccgcctcc cgggcgggcg tgggggtggg ggccgggccg cccctcccac 14700
ggcgcgaccg ctctcccacc cccctccgtc gcctctctcg gggcccggbg gggggcgggg 14760
cggactgtcc ccagtgcgcc ccgggcgtcg tcgcgccgtc gggtcccggg gggaccgtcg 14820
gtcacgcgtc tcccgacgaa gccgagcgca cggggtcggc ggcgatgtcg gctacccacc 14880
cgacccgtct tgaaacacgg accaaggagt ctaacgcgtg cgcgagtcag gggctcgtcc 14940
gaaagccgcc gtggcgcaat gaaggtgaag ggccccgccc gggggcccga ggtgggatcc 15000
cgaggcctct ccagtccgcc gagggcgcac caccggcccg tctcgcccgc cgcgccgggg 15060
aggtggagca cgagcgtacg cgttaggacc cgaaagatgg tgaactatgc ttgggcaggg 15120
cgaagccaga ggaaactctg gtggaggtcc gtagcggtcc tgacgtgcaa atcggtcgtc 15180
cgacctgggt ataggggcga aagactaatc gaaccatcta gtagctggtt ccctccgaag 15240
tttccctcag gatagctggc gctctcgctc ccgacgtacg cagttttatc cggtaaagcg 15300
aatgattaga ggtcttgggg ccgaaacgat ctcaacctat tctcaaactt taaatgggta 15360
agaagcccgg ctcgctggcg tggagccggg cgtggaatgc gagtgcctag tgggccactt 15420
ttggtaagca gaactggcgc tgcgggatga accgaacgcc gggttaaggc gcccgatgcc 15480
gacgctcatc agaccccaga aaaggtgttg gttgatatag acagcaggac ggtggccatg 15540
gaagtcggaa tccgctaagg agtgtgtaac aactcacctg ccgaatcaac tagccctgaa 15600
aatggatggc gctggagcgt cgggcccata cccggccgtc gccgcagtcg gaacggaacg 15660
ggacgggagc ggccgcgggt gcgcgtctct cggggtcggg ggtgcgtggc gggggcccgt 15720
cccccgcctc ccctccgcgc gccgggttcg cccccgcggc gtcgggcccc gcggagccta 15780
cgccgcgacg agtaggaggg ccgctgcggt gagccttgaa gcctagggcg cgggcccggg 15840
tggagccgcc gcaggtgcag atcttggtgg tagtagcaaa tattcaaacg agaactttga 15900
aggccgaagt ggagaagggt tccatgtgaa cagcagttga acatgggtca gtcggtcctg 15960
agagatgggc gagtgccgtt ccgaagggac gggcgatggc ctccgttgcc ctcggccgat 16020
cgaaagggag tcgggttcag atccccgaat ccggagtggc ggagatgggc gccgcgaggc 16080
cagtgcggta acgcgaccga tcccggagaa gccggcggga ggcctcgggg agagttctct 16140
tttctttgtg aagggcaggg cgccctggaa tgggttcgcc ccgagagagg ggcccgtgcc 16200
ttggaaagcg tcgcggttcc ggcggcgtcc ggtgagctct cgctggccct tgaaaatccg 16260
ggggagaggg tgtaaatctc gcgccgggcc gtacccatat ccgcagcagg tctccaaggt 16320
gaacagcctc tggcatgttg gaacaatgta ggtaagggaa gtcggcaagc cggatccgta 16380
acttcgggat aaggattggc tctaagggct gggtcggtcg ggctggggcg cgaagcgggg 16440
ctgggcgcgc gccgcggctg gacgaggcgc cgccgccctc tcccacgtcc ggggagaccc 16500
cccgtccttt ccgcccgggc ccgccctccc ctcttccccg cggggccccg tcgtcccccg 16560
cgtcgtcgcc acctctcttc ccccctcctt cttcccgtcg gggggcgggt cgggggtcgg 16620
cgcgcggcgc gggctccggg gcggcgggtc caaccccgcg ggggttccgg agcgggagga 16680
accagcggtc cccggtgggg cggggggccc ggacactcgg ggggccggcg gcggcggcga 16740
ctctggacgc gagccgggcc cttcccgtgg atcgcctcag ctgcggcggg cgtcgcggcc 16800
gctcccgggg agcccggcgg gtgccggcgc gggtcccctc cccgcggggc ctcgctccac 16860
ccccccatcg cctctcccga ggtgcgtggc gggggcgggc gggcgtgtcc cgcgcgtgtg 16920
gggggaacct ccgcgtcggt gttcccccgc cgggtccgcc ccccgggccg cggttttccg 16980
cgcggcgccc ccgcctcggc cggcgcctag cagccgactt agaactggtg cggaccaggg 17040
gaatccgact gtttaattaa aacaaagcat cgcgaaggcc cgcggcgggt gttgacgcga 17100
tgtgatttct gcccagtgct ctgaatgtca aagtgaagaa attcaatgaa gcgcgggtaa 17160
acggcgggag taactatgac tctcttaagg tagccaaatg cctcgtcatc taattagtga 17220
cgcgcatgaa tggatgaacg agattcccac tgtccctacc tactatccag cgaaaccaca 17280
gccaagggaa cgggcttggc ggaatcagcg gggaaagaag accctgttga gcttgactct 17340
agtctggcac ggtgaagaga catgagaggt gtagaataag tgggaggccc ccggcgcccg 17400
gccccgtcct cgcgtcgggg tcggggcacg ccggcctcgc gggccgccgg tgaaatacca 17460
ctactctcat cgttttttca ctgacccggt gaggcggggg ggcgagcccc gaggggctct 17520
cgcttctggc gccaagcgtc cgtcccgcgc gtgcgggcgg gcgcgacccg ctccggggac 17580
agtgccaggt ggggagtttg actggggcgg tacacctgtc aaacggtaac gcaggtgtcc 17640
taaggcgagc tcagggagga cagaaacctc ccgtggagca gaagggcaaa agctcgcttg 17700
atcttgattt tcagtacgaa tacagaccgt gaaagcgggg cctcacgatc cttctgacct 17760
tttgggtttt aagcaggagg tgtcagaaaa gttaccacag ggataactgg cttgtggcgg 17820
ccaagcgttc atagcgacgt cgctttttga tccttcgatg tcggctcttc ctatcattgt 17880
gaagcagaat tcaccaagcg ttggattgtt cacccactaa tagggaacgt gagctgggtt 17940
tagaccgtcg tgagacaggt tagttttacc ctactgatga tgtgttgttg ccatggtaat 18000
cctgctcagt acgagaggaa ccgcaggttc agacatttgg tgtatgtgct tggctgagga 18060
gccaatgggg cgaagctacc atctgtggga ttatgactga acgcctctaa gtcagaatcc 18120
gcccaagcgg aacgatacgg cagcgccgaa ggagcctcgg ttggccccgg atagccgggt 18180
ccccgtccgt cccgctcggc ggggtccccg cgtcgccccg cggcggcgcg gggtctcccc 18240
ccgccgggcg tcgggaccgg ggtccggtgc ggagagccgt tcgtcttggg aaacggggtg 18300
cggccggaaa gggggccgcc ctctcgcccg tcacgttgaa cgcacgttcg tgtggaacct 18360
ggcgctaaac cattcgtaga cgacctgctt ctgggtcggg gtttcgtacg tagcagagca 18420
gctccctcgc tgcgatctat tgaaagtcag ccctcgacac aagggtttgt ctctgcgggc 18480
tttcccgtcg cacgcccgct cgctcgcacg cgaccgtgtc gccgcccggg cgtcacgggg 18540
gcggtcgcct cggcccccgc gcggttgccc gaacgaccgt gtggtggttg ggggggggat 18600
cgtcttctcc tccgtctccc gaggacggtt cgtttctctt tccccttccg tcgctctcct 18660
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tgggtgtggg 
cttgccctcc 
ccgcggcgcg 
gttggagggg 
gccggggggg 
gcggcggcga 
gcggcgcctc 
actttgtttt 
cccccccccc 
tttttttttt 
catataggtc 
gaccagatat 
tttttttttt 
atatacttat 
tccaccgatg 
ccgcgacgcg 
tggaacctta 
tactttgtct 
ataaattatc 
tttgtgttgt 
tttgtgttgt 
gttgggttgg 
ttgtttgctg 
tacacaaaca 
tatccctttc 
tgtgtgtgtg 
tacttataat 
gcagacttct 
aataaataca 
gttgaccagt 
aatagataga 
attaaccact 
atttgaactc 
tggtctgtct 
tgcttttttt 
tctgtagacc 
caattttgga 
attttattat 
attagttgga 
ttgtggggct 
gatttttgta 
tttcattgct 
ccagttcctc 
gatgtgctag 
aaaagttcta 
tgttctcact 
agacatatat 
ttcccagacg 
caccacaact 
ggaaaagcat 
ggttgtgaac 
agtcagggct 
tgaatgatcc 
ataaaataat 
ctcacagcac 
gaggggtggg 
atggcctggt 
tacctgaagt 



agcctcgtgc 
ggccttggcc 
gtgacgcacg 
cgggaggggt 
cgctctctcc 
cgtgcgtacg 
ttccattttt 
ttttttttcc 
ccccccggcg 
ttaaattcct 
gaccagtact 
c cgaaagt cc 
tttggtgtgc 
aggaggaggt 
a t ggaggt eg 
geggge t cac 
aggtcgacca 
ttttctgaaa 
tgatctagat 
tttgttttgt 
gttgtgttgt 
gttgggttgt 
ttgttttgtg 
tgcacttttt 
cttctctctc 
tgcgtgtgtg 
aataggtege 
gagttcgagg 
tacatacata 
tgtcaatcct 
tggatagagt 
tttccctttt 
aggaccctgg 
gctgtttgtt 
tttcttctga 
agcctggcct 
gtaaaggtgt 
tagacagaac 
ccaattagtt 
ggggatcagg 
aagattactt 
tcatttctat 
ctgccttctg 
tgaaccagag 
acaaagtgat 
ctgccaccaa 
tttttctttt 
gecttttgag 
ctaacctgtt 
gtagcagttg 
cacccaccat 
ctaaaccgat 
cagcatggga 
gaaatgaatg 
ctccccctcc 
gtgggggcag 
tctctgaact 
ccctgagtga 



cgtcgcgacc 
aagceggagg 
gtgggatccc 
ttttcccgtg 
gcccgagcat 
aggggaggat 
tcccccccaa 
cccgatgctg 
eggageggeg 
ggaaccttta 
ccgggtggta 
tctctttccc 
ctctttttga 
cgaccagtac 
accagatgtc 
tctggactct 
gttgtccgtc 
ategcagagg 
ttgtttttct 
tttgttttgt 
gttgtgttgg 
gttgtttggt 
ttttgcgggt 
ttaaaataaa 
ttttttaaaa 
tgtgtgtgtg 
cgggtggtgg 
ccagcctggt 
catacataca 
ttagaatttt 
gatacaaata 
taggtttttt 
caggtcaact 
tgtttgcttg 
gacagtattt 
caatcgaact 
gctacaccac 
gaaatcaact 
ggctggtttg 
tatctcaacg 
ttcttagtct 
ttctctttct 
gaagatgtag 
agtttggatg 
ctttaacttt 
cgcgctttgt 
ggttttgctt 
aataaaatgg 
tggctgtttt 
taggacacac 

gtggttgcct 

gagccatctc 
agacagtctg 
aagtctccac 
cccacactgc 
ggatctgeat 
gttgagcett 
tgatttccct 



gcggcctgcc 
geggaggagg 
catcctcggc 
aacgccgcgt 
ccccactccc 
gtcgcggtgt 
etteggaggt 
gaggtcgacc 
gggccactct 
ggtcgaccag 
ctttgtcttt 
tttactcttc 
cttatataca 
tccgggcgac 
cgaaagtgtc 
tttttttttt 
tttcactcat 
tcgaccagat 
gtttttcagt 
tttgttttgt 
gttgggttgg 
tttgtgttgt 
cgaacagttg 
tttttaaaat 
attttctttg 
cgtgcagcgt 
tagcttcccg 
ctacagagga 
tacatacata 
gtttttaatt 
taggtttttt 
tttttttccc 
ggaaaacgtg 
ettgettget 
ctctgtgtaa 
cagaaatcct 
tgcctggcat 
agttggtcct 
ggaggtttct 
gaatgcatga 
gaggaaaaaa 
ttctttcttt 
geattgeatt 
teaagcegta 
tttttttttt 
acattgaatg 
gacatggttt 
gaggecagaa 
ccttcccaag 
tagacgagag 
gggatttgaa 
tccagccctc 
ccctctttgt 
gtatttattt 
ctttctccct 
gtcttcttgc 
gtctatccag 
gtgaattc 



gtcgcctgcc 
gggatcggcg 
gcgtccgtcg 
tcggcgccag 
gcccctcctc 
ggaggeggag 
cgaccagtac 
agatgtccga 
ggactctttt 
ttgtccgtct 
ttctgaaaat 
cccacagcga 
tgtaaatagt 
actttgtttt 
ccgtcccccc 
tttttttttt 
tcatataggt 
gtcagaaagt 
tttgtgttgt 
tttgttttgt 
gttgggttgg 
ttggtgttgt 
tccctaaccg 
aaatgcgaaa 
tgtgtgtgtg 
gcgcgcgc t c 
gactccagag 
accctgtctc 
catacataca 
aatgtgatag 
tttcagtaaa 
ctgtccatgt 
ttttctatat 
tgcttgcttg 
cctggtgccc 
cctgcctctt 
tattatcatt 
gtttcgttaa 
tttgtttccg 
aggttaaggt 
taaaataata 
ctttcagata 
gggaaaagca 
taatgtttat 
tttctccttc 
tgagctttgt 
ccctttctat 
ccaaagtctt 
gcacagatct 
caccagatct 
ctcaggatct 
ctacattcct 
ggtatatcac 
cttcgagcta 
atgtttgggt 
aggtctgtga 
aggctgactg 



gccgcagccc 
gcggcggcga 
gggacggccg 
gcctctggcg 
ttcgcgcgcc 
agggtcegge 
tccgggcgac 
aagtgtcccc 
tttttttttt 
tttactcctt 
cccagaggtc 
ttctcttttt 
gtgtacgttt 
tttttttttt 
cctccccccc 
tttaaatttc 
cgaccggtgg 
ctggtggtcg 
tttgtgttgt 
tttgttttgt 

gttgggttgg 

tggttttgtt 
agtttttttg 
atcgaccaat 
tgtgtgtgtg 
gttttataaa 
gcagaggcag 
gaaaaatgaa 
tacatatgag 
agagatagat 
tatgaggttg 
ggttgctggg 
atataaatag 
ettgettget' 
tgaaactcac 
gtctacctcc 
atcattatta 
ttcatttgaa 
atttgggtgt 
gagatggctc 
ttgggctacg 
aggaggtcgg 
ttgtttgaga 
tacaatatag 
tacttctact 
tttgcttaac 
ccgtgcaggg 
ttgaataaag 
ttcccagcat 
cattgtgggt 
tcagaagacg 
tcttaaggca 
catatactca 
tctaaattct 

ggggctgggg 

actatttgcg 
gctagttttc 



18720 
18780 
18840 
18900 
18960 
19020 
19080 
19140 
19200 
19260 
19320 
19380 
19440 
19500 
19560 
19620 
19680 
19740 
19800 
19860 
19920 
19980 
20040 
20100 
20160 
20220 
20280 
20340 
20400 
20460 
20520 
20580 
20640 
20700 
20760 
20820 
20880 
20940 
21000 
21060 
21120 
21180 
21240 
21300 
21360 
21420 
21480 
21540 
21600 
21660 
21720 
21780 
21840 
21900 
21960 
22020 
22080 
22118 



<210> 19 
<211> 175 
<212> DNA 

<213> Mus musculus 



<400> 19 

ctcccgcgcg gcccccgtgt tcgccgttcc cgtggcgcgg acaatgcggt tgtgcgtcca 60 

cgtgtgcgtg tccgtgcagt gccgttgtgg agtgcctcgc tctcctcctc ctccccggca 120 

gcgttcccac ggttggggac caccggtgac ctcgccctct tcgggcctgg atccg 175 

<210> 20 

<211> 755 

<212> DNA 

<213> Mus musculus 



<400> 20 

ggtctggtgg gaattgttga cctcgctctc gggtgcggcc tttggggaac ggcggggtcg 60 

gtcgtgcccg gcgccggacg tgtgtcgggg cccacttccc gctcgagggt ggcggtggcg 120 

gcggcgttgg tagtctcccg tgttgcgtct tcccgggctc ttgggggggg tgccgtcgtt 180 

ttcggggccg gcgttgcttg gcttacgcag gcttggtttg ggactgcctc aggagtcgtg 240 

ggcggtgtga ttcccgccgg ttttgcctcg cgtctgcctg ctttgcctcg ggtttgcttg 300 

gttcgtgtct cgggagcggt ggtttttttt tttttcgggt cccggggaga ggggtttttc 360 

cgggggacgt tcccgtcgcc ccctgccgcc ggtgggtttt cgtttcgggc tgtgttcgtt 420 

tccccttccc cgtttcgccg tcggttctcc ccggtcggtc ggccctctcc ccggtcggtc 480 

gcccggccgt gctgccggac ccccccttct gggggggatg cccgggcacg cacgcgtccg 540 

ggcggccact gtggtccggg agctgctcgg caggcgggtg agccagttgg aggggcgtca 600 

tgcccccgcg ggctcccgtg gccgacgcgg cgtgttcttt gggggggcct gtgcgtgcgg 660 

gaaggctgcg cacgttgtcg gtccttgcga gggaaagagg cttttttttt ttagggggtc 720 

gtccttcgtc gtcccgtcgg cggtggatcc ggcct 755 



<210> 21 

<211> 463 

<212> DNA 

<213> Mus musculus 



<400> 21 

ggccgaggtg cgtctgcggg ttggggctcg tccggccccg tcgtcctccg ggaaggcgtt 60 

tagcgggtac cgtcgccgcg ccgaggtggg cgcacgtcgg tgagataacc ccgagcgtgt 120 

ttctggttgt tggcggcggg ggctccggtc gatgtcttcc cctccccctc tccccgaggc 180 

caggtcagcc tccgcctgtg ggcttcgtcg gccgtctccc cccccctcac gtccctcgcg 240 

agcgagcccg tccgttcgac cttccttccg ccttcccccc atctttccgc gctccgttgg 300 

ccccggggtt ttcacggcgc cccccacgct cctccgcctc tccgcccgtg gtttggacgc 360 

ctggttccgg tctccccgcc aaaccccggt tgggttggtc tccggccccg gcttgctctt 420 

cgggtctccc aacccccggc cggaagggtt cgggggttcc ggg 463 



<210> 22 

<211> 378 

<212> DNA 

<213> Mus musculus 



<400> 22 

ggattcttca ggattgaaac ccaaaccggt tcagtttcct ttccggctcc ggccgggggg 60 

ggcggccccg ggcggtttgg tgagttagat aacctcgggc cgatcgcacg ccccccgtgg 120 

cggcgacgac ccattcgaac gtctgcccta tcaactttcg atggtagtcg atgtgcctac 180 

catggtgacc acgggtgacg gggaatcagg gttcgattcc ggagagggag cctgagaaac 240 

ggctaccaca tccaaggaag gcagcaggcg cgcaaattac ccactcccga cccggggagg 300 

tagtgacgaa aaataacaat acaggactct ttcgaggccc tgtaattgga atgagtccac 360 

tttaaatcct ttaagcag 378 



<210> 23 

<211> 378 

<212> DNA 

<213> Mus musculus 



<400> 23 

gatccattgg agggcaagtc tggtgccagc agccgcggta attccagctc caatagcgta 60 

tattaaagtt gctgcagtta aaaagctcgt agttggatct tgggagcggg cgggcggtcc 120 

gccgcgaggc gagtcaccgc ccgtccccgc cccttgcctc tcggcgcccc ctcgatgctc 180 

ttagctgagt tgtcccgcgg ggcccgaagc gtttactttg aaaaaattag agttgtttca 240 

aagcaggccc gagccgcctg gataccgcca gctaggaaat aatggaatag gaccgcggtt 300 

cctattttgt ttggttttcg gaactgagcc catgattaag ggaaacggcc gggggcattc 360 

ccttattgcg ccccccta 378 



<210> 24 

<211> 719 

<212> DNA 

<213> Mus musculus 

<400> 24 

ggatctttcc cgctccccgt tcctcccggc ccctccaccc gcgcgtctcc ccccttcttt 60 

tcccctctcc ggaggggggg gaggtggggg cgcgtgggcg gggtcggggg tggggtcggc 120 

gggggaccgc ccccggccgg caaaaggccg ccgccgggcg cacttcaacc gtagcggtgc 180 

gccgcgaccg gctacgagac ggctgggaag gcccgacggg gaatgtggct cggggggggc 240 

ggcgcgtctc agggcgcgcc gaaccacctc accccgagtg ttacagccct ccggccgcgc 300 

tttcgcggaa tcccggggcc gaggggaagc ccgatacccg tcgccgcgct tttcccctcc 360 

ccccgtccgc ctcccgggcg ggcgtggggg tgggggccgg gccgcccctc ccacgcccgt 420 

ggtttctctc tctcccggtc tcggccggtt tggggggggg agcccggttg ggggcggggc 480 

ggactgtcct cagtgcgccc cgggcgtcgt cgcgccgtcg ggcccggggg gttctctcgg 540 

tcacgccgcc cccgacgaag ccgagcgcac ggggtcggcg gcgatgtcgg ctacccaccc 600 

gacccgtctt gaaacacgga ccaaggagtc taacgcgtgc gcgagtcagg ggctcgcacg 660 

aaagccgccg tggcgcaatg aaggtgaagg gccccgtccg ggggcccgag gtgggatcc 719 

<210> 25 

<211> 685 

<212> DNA 

<213> Mus musculus 

<400> 25 

cgaggcctct ccagtccgcc gagggcgcac caccggcccg tctcgcccgc cgcgtcgggg 60 

aggtggagca cgagcgtacg cgttaggacc cgaaagatgg tgaactatgc ctgggcaggg 120 

cgaagccaga ggaaactctg gtggaggtcc gtagcggtcc tgacgtgcaa atcggtcgtc 180 

cgacctgggt ataggggcga aagactaatc gaaccatcta gtagctggtt ccctccgaag 240 

tttccctcag gatagctggc gctctcgcaa ccttcggaag cagttttatc cgggtaaagg 300 

cggaatggat taggaggtct tggggccgga aacgatctca aactatttct caaactttaa 360 

atgggtaagg aagcccggct cgctggcgtg gagccgggcg tggaatgcga gtgcctagtg 420 

ggccactttt ggtaagcaga actggcgctg cgggatgaac cgaacgccgg gttaaggcgc 480 

ccgatgccga cgctcatcag accccagaaa aggtgttggt tgatatagac agcaggacgg 540 

tggccatgga agtcggaatc cgctaaggag tgtgtaacaa ctcacctgcc gaatcaacta 600 

gccctgaaaa tggatggcgc tggagcgtcg ggcccatacc cggccgtcgc cggcagtcgg 660 

aacgggacgg gacgggagcg gccgc 685 

<210> 26 
<211> 5162 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Chimeric bacterial plasmid 
<400> 26 

gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60 

ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120 

cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180 

ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240 

gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300 

tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360 

cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420 

attgacgtca atgggtggac tatttacggt aaactgccca cttggcagta catcaagtgt 480 

atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540 

atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600 

tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660 

actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720 

aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780 

gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840 

ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gcttggtacc 900 

gagctcggat cgatatctgc ggccgcgtcg acggaattca gtggatccac tagtaacggc 960 

cgccagtgtg ctggaattaa ttcgctgtct gcgagggcca gctgttgggg tgagtactcc 1020 

ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa cgaggaggat 1080 

ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat ctggtcagaa 1140 

aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg gccatacact 1200 

tgagtgacaa tgacatccac tttgcctttc tctccacagg tgtccactcc caggtccaac 1260 

tgcaggtcga gcatgcatct agggcggcca attccgcccc tctccctccc ccccccctaa 1320 

cgttactggc cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgtgattttc 1380 

caccatattg ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac 1440 

gagcattcct aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt 1500 

gaaggaagca gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg 1560 

caggcagcgg aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata 1620 

agatacacct gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga 1680 

aagagtcaaa tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt 1740 

accccattgt atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc 1800 

gaggttaaaa aaacgtctag gccccccgaa ccacggggac gtggttttcc tttgaaaaac 1860 

acgatgataa gcttgccaca acccgggatc caccggtcgc caccatggtg agcaagggcg 1920 

aggagctgtt caccggggtg gtgcccatcc tggtcgagct ggacggcgac gtaaacggcc 1980 

acaagttcag cgtgtccggc gagggcgagg gcgatgccac ctacggcaag ctgaccctga 2040 

agttcatctg caccaccggc aagctgcccg tgccctggcc caccctcgtg accaccctga 2100 

cctacggcgt gcagtgcttc agccgctacc ccgaccacat gaagcagcac gacttcttca 2160 

agtccgccat gcccgaaggc tacgtccagg agcgcaccat cttcttcaag gacgacggca 2220 

actacaagac ccgcgccgag gtgaagttcg agggcgacac cctggtgaac cgcatcgagc 2280 

tgaagggcat cgacttcaag gaggacggca acatcctggg gcacaagctg gagtacaact 2340 

acaacagcca caacgtctat atcatggccg acaagcagaa gaacggcatc aaggtgaact 2400 

tcaagatccg ccacaacatc gaggacggca gcgtgcagct cgccgaccac taccagcaga 2460 

acacccccat cggcgacggc cccgtgctgc tgcccgacaa ccactacctg agcacccagt 2520 

ccgccctgag caaagacccc aacgagaagc gcgatcacat ggtcctgctg gagttcgtga 2580 

ccgccgccgg gatcactctc ggcatggacg agctgtacaa gtaaagcggc cctagagctc 2640 

gctgatcagc ctcgactgtg cctctagttg ccagccatct gttgtttgcc cctcccccgt 2700 

gccttccttg accctggaag gtgccactcc cactgtcctt tcctaataaa atgaggaaat 2760 

tgcatcgcat tgtctgagta ggtgtcattc tattctgggg ggtggggtgg ggcaggacag 2820 

caagggggag gattgggaag acaatagcag gcatgctggg gatgcggtgg gctctatggc 2880 

ttctgaggcg gaaagaacca gctggggctc gagtgcattc tagttgtggt ttgtccaaac 2940 

tcatcaatgt atcttatcat gtctgtatac cgtcgacctc tagctagagc ttggcgtaat 3000 

catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca cacaacatac 3060 

gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa ctcacattaa 3120 

ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat 3180 

gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc 3240 

tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg 3300 

cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg tgagcaaaag 3360 

gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc 3420 

gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag 3480 

gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga 3540 

ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc 3600 

aatgctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg 3660 

tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt 3720 

ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca 3780 

gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca 3840 

ctagaaggac agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag 3900 

ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca 3960 

agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg 4020 

ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa 4080 

aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta 4140 

tatatgagta aacttggtct gacagttacc aatgcttaat cagtgaggca cctatctcag 4200 

cgatctgtct atttcgttca tccatagttg cctgactccc cgtcgtgtag ataactacga 4260 

tacgggaggg cttaccatct ggccccagtg ctgcaatgat accgcgagac ccacgctcac 4320 

cggctccaga tttatcagca ataaaccagc cagccggaag ggccgagcgc agaagtggtc 4380 

ctgcaacttt atccgcctcc atccagtcta ttaattgttg ccgggaagct agagtaagta 4440 

gttcgccagt taatagtttg cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac 4500 

gctcgtcgtt tggtatggct tcattcagct ccggttccca acgatcaagg cgagttacat 4560 

gatcccccat gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc gttgtcagaa 4620 

gtaagttggc cgcagtgtta tcactcatgg ttatggcagc actgcataat tctcttactg 4680 

tcatgccatc cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag tcattctgag 4740 

aatagtgtat gcggcgaccg agttgctctt gcccggcgtc aatacgggat aataccgcgc 4800 

cacatagcag aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct 4860 

caaggatctt accgctgttg agatccagtt cgatgtaacc cactcgtgca cccaactgat 4920 

cttcagcatc ttttactttc accagcgttt ctgggtgagc aaaaacagga aggcaaaatg 4980 

ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat actcatactc ttcctttttc 5040 

aatattattg aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta 5100 

tttagaaaaa taaacaaata ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg 5160 

tc 5162 



<210> 27 
<211> 5627 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pMG plasmid from InvivoGen; IRES sequence modified 
EMCV nucleotides 2736-3308 



<400> 27 

caccggcgaa ggaggcctag atctatcgat tgtacagcta gctcgacatg ataagataca 60 

ttgatgagtt tggacaaacc acaactagaa tgcagtgaaa aaaatgcttt atttgtgaaa 120 

tttgtgatgc tattgcttta tttgtgaaat ttgtgatgct attgctttat ttgtaaccat 180 

tataagctgc aataaacaag ttaacaacaa caattgcatt cattttatgt ttcaggttca 240 

gggggaggtg tgggaggttt tttaaagcaa gtaaaacctc tacaaatgtg gtagatccat 300 

ttaaatgtta attaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 360 

ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 420 

acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 480 

tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 540 

ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 600 

ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 660 

ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 720 

actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 780 

gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg gtatctgcgc 840 

tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 900 

caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 960 

atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 1020 

acgttaaggg attttggtca tggctagtta attaagctgc aataaacaat cattattttc 1080 

attggatctg tgtgttggtt ttttgtgtgg gcttggggga gggggaggcc agaatgactc 1140 

caagagctac aggaaggcag gtcagagacc ccactggaca aacagtggct ggactctgca 1200 

ccataacaca caatcaacag gggagtgagc tggatcgagc tagagtccgt tacataactt 1260 

acggtaaatg gcccgcctgg ctgaccgccc aacgaccccc gcccattgac gtcaataatg 1320 

acgtatgttc ccatagtaac gccaataggg actttccatt gacgtcaatg ggtggagtat 1380 

ttacggtaaa ctgcccactt ggcagtacat caagtgtatc atatgccaag tacgccccct 1440 

attgacgtca atgacggtaa atggcccgcc tggcattatg cccagtacat gaccttatgg 1500 

gactttccta cttggcagta catctacgta ttagtcatcg ctattaccat ggtgatgcgg 1560 

ttttggcagt acatcaatgg gcgtggatag cggtttgact cacggggatt tccaagtctc 1620 

caccccattg acgtcaatgg gagtttgttt tggcaccaaa atcaacggga ctttccaaaa 1680 

tgtcgtaaca actccgcccc attgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 1740 

tatataagca gagctcgttt agtgaaccgt cagatcgcct ggagacgcca tccacgctgt 1800 

tttgacctcc atagaagaca ccgggaccga tccagcctcc gcggccggga acggtgcatt 1860 

ggaacgcgga ttccccgtgc caagagtgac gtaagtaccg cctatagagt ctataggccc 1920 

acccccttgg cttcttatgc atgctatact gtttttggct tggggtctat acacccccgc 1980 

ttcctcatgt tataggtgat ggtatagctt agcctatagg tgtgggttat tgaccattat 2040 

tgaccactcc cctattggtg acgatacttt ccattactaa tccataacat ggctctttgc 2100 

cacaactctc tttattggct atatgccaat acactgtcct tcagagactg acacggactc 2160 

tgtattttta caggatgggg tctcatttat tatttacaaa ttcacatata caacaccacc 2220 

gtccccagtg cccgcagttt ttattaaaca taacgtggga tctccacgcg aatctcgggt 2280 

acgtgttccg gacatgggct cttctccggt agcggcggag cttctacatc cgagccctgc 2340 

tcccatgcct ccagcgactc atggtcgctc ggcagctcct tgctcctaac agtggaggcc 2400 

agacttaggc acagcacgat gcccaccacc accagtgtgc cgcacaaggc cgtggcggta 2460 

gggtatgtgt ctgaaaatga gctcggggag cgggcttgca ccgctgacgc atttggaaga 2520 

cttaaggcag cggcagaaga agatgcaggc agctgagttg ttgtgttctg ataagagtca 2580 

gaggtaactc ccgttgcggt gctgttaacg gtggagggca gtgtagtctg agcagtactc 2640 

gttgctgccg cgcgcgccac cagacataat agctgacaga ctaacagact gttcctttcc 2700 

atgggtcttt tctgcagtca cccgggggat ccttcgaacg tagctctaga ttgagtcgac 2760 

gttactggcc gaagccgctt ggaataaggc cggtgtgcgt ttgtctatat gttattttcc 2820 

accatattgc cgtcttttgg caatgtgagg gcccggaaac ctggccctgt cttcttgacg 2880 

agcattccta ggggtctttc ccctctcgcc aaaggaatgc aaggtctgtt gaatgtcgtg 2940 

aaggaagcag ttcctctgga agcttcttga agacaaacaa cgtctgtagc gaccctttgc 3000 

aggcagcgga accccccacc tggcgacagg tgcctctgcg gccaaaagcc acgtgtataa 3060 

gatacacctg caaaggcggc acaaccccag tgccacgttg tgagttggat agttgtggaa 3120 

agagtcaaat ggctctcctc aagcgtattc aacaaggggc tgaaggatgc ccagaaggta 3180 

ccccattgta tgggatctga tctggggcct cggtgcacat gctttacatg tgtttagtcg 3240 

aggttaaaaa aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca 3300 

cgataatacc atgggtaagt gatatctact agttgtgacc ggcgcctagt gttgacaatt 3360 

aatcatcggc atagtatatc ggcatagtat aatacgactc actataggag ggccaccatg 3420 

tcgactacta accttcttct ctttcctaca gctgagatca ccggtaggag ggccatcatg 3480 

aaaaagcctg aactcaccgc gacgtctgtc gcgaagtttc tgatcgaaaa gttcgacagc 3540 

gtctccgacc tgatgcagct ctcggagggc gaagaatctc gtgctttcag cttcgatgta 3600 

ggagggcgtg gatatgtcct gcgggtaaat agctgcgccg atggtttcta caaagatcgt 3660 

tatgtttatc ggcactttgc atcggccgcg ctcccgattc cggaagtgct tgacattggg 3720 

gaattcagcg agagcctgac ctattgcatc tcccgccgtg cacagggtgt cacgttgcaa 3780 

gacctgcctg aaaccgaact gcccgctgtt ctgcaacccg tcgcggagct catggatgcg 3840 

atcgctgcgg ccgatcttag ccagacgagc gggttcggcc cattcggacc gcaaggaatc 3900 

ggtcaataca ctacatggcg tgatttcata tgcgcgattg ctgatcccca tgtgtatcac 3960 

tggcaaactg tgatggacga caccgtcagt gcgtccgtcg cgcaggctct cgatgagctg 4020 

atgctttggg ccgaggactg ccccgaagtc cggcacctcg tgcacgcgga tttcggctcc 4080 

aacaatgtcc tgacggacaa tggccgcata acagcggtca ttgactggag cgaggcgatg 4140 

ttcggggatt cccaatacga ggtcgccaac atcttcttct ggaggccgtg gttggcttgt 4200 

atggagcagc agacgcgcta cttcgagcgg aggcatccgg agcttgcagg atcgccgcgg 4260 

ctccgggcgt atatgctccg cattggtctt gaccaactct atcagagctt ggttgacggc 4320 

aatttcgatg atgcagcttg ggcgcagggt cgatgcgacg caatcgtccg atccggagcc 4380 

gggactgtcg ggcgtacaca aatcgcccgc agaagcgcgg ccgtctggac cgatggctgt 4440 

gtagaagtac tcgccgatag tggaaaccga cgccccagca ctcgtccgag ggcaaaggaa 4500 

tgagtcgaga attcgctaga gggccctatt ctatagtgtc acctaaatgc tagagctcgc 4560 

tgatcagcct cgactgtgcc ttctagttgc cagccatctg ttgtttgccc ctcccccgtg 4620 

ccttccttga ccctggaagg tgccactccc actgtccttt cctaataaaa tgaggaaatt 4680 

gcatcgcatt gtctgagtag gtgtcattct attctggggg gtggggtggg gcaggacagc 4740 

aagggggagg attgggaaga caatagcagg catgcgcagg gcccaattgc tcgagcggcc 4800 

gcaataaaat atctttattt tcattacatc tgtgtgttgg ttttttgtgt gaatcgtaac 4860 

taacatacgc tctccatcaa aacaaaacga aacaaaacaa actagcaaaa taggctgtcc 4920 

ccagtgcaag tgcaggtgcc agaacatttc tctatcgaag gatctgcgat cgctccggtg 4980 

cccgtcagtg ggcagagcgc acatcgccca cagtccccga gaagttgggg ggaggggtcg 5040 

gcaattgaac cggtgcctag agaaggtggc gcggggtaaa ctgggaaagt gatgtcgtgt 5100 

actggctccg cctttttccc gagggtgggg gagaaccgta tataagtgca gtagtcgccg 5160 

tgaacgttct ttttcgcaac gggtttgccg ccagaacaca gctgaagctt cgaggggctc 5220 

gcatctctcc ttcacgcgcc cgccgcccta cctgaggccg ccatccacgc cggttgagtc 5280 

gcgttctgcc gcctcccgcc tgtggtgcct cctgaactgc gtccgccgtc taggtaagtt 5340 

taaagctcag gtcgagaccg ggcctttgtc cggcgctccc ttggagccta cctagactca 5400 

gccggctctc cacgctttgc ctgaccctgc ttgctcaact ctacgtcttt gtttcgtttt 5460 

ctgttctgcg ccgttacaga tccaagctgt gaccggcgcc tacgtaagtg atatctacta 5520 

gatttatcaa aaagagtgtt gacttgtgag cgctcacaat tgatacttag attcatcgag 5580 

agggacacgt cgactactaa ccttcttctc tttcctacag ctgagat 5627 



<210> 28 
<211> 553 
<212> DNA 

<213> Artificial Sequence 



<223> pMG plasmid from InvivoGen: EMCV IRES sequence 



<400> 28 

aacgttactg gccgaagccg cttggaataa ggccggtgtg cgtttgtcta tatgttattt 60 

tccaccatat tgccgtcttt tggcaatgtg agggcccgga aacctggccc tgtcttcttg 120 

acgagcattc ctaggggtct ttcccctctc gccaaaggaa tgcaaggtct gttgaatgtc 180 

gtgaaggaag cagttcctct ggaagcttct tgaagacaaa caacgtctgt agcgaccctt 240 

tgcaggcagc ggaacccccc acctggcgac aggtgcctct gcggccaaaa gccacgtgta 300 

taagatacac ctgcaaaggc ggcacaaccc cagtgccacg ttgtgagttg gatagttgtg 360 

gaaagagtca aatggctctc ctcaagcgta ttcaacaagg ggctgaagga tgcccagaag 420 

gtaccccatt gtatgggatc tgatctgggg cctcggtgca catgctttac gtgtgtttag 480 

tcgaggttaa aaaacgtcta ggccccccga accacgggga cgtggttttc ctttgaaaaa 540 

cacgatgata ata 553 



<210> 29 

<211> 4692 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> pDsRed1-N1 plasmid from Clontech 

<400> 29 

tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg 60 

cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt 120 

gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca 180 

atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc 240 

aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta 300 

catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac 360 

catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg 420 

atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg 480 

ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt 540 

acggtgggag gtctatataa gcagagctgg tttagtgaac cgtcagatcc gctagcgcta 600 

ccggactcag atctcgagct caagcttcga attctgcagt cgacggtacc gcgggcccgg 660 

gatccaccgg tcgccaccat ggtgcgctcc tccaagaacg tcatcaagga gttcatgcgc 720 

ttcaaggtgc gcatggaggg caccgtgaac ggccacgagt tcgagatcga gggcgagggc 780 

gagggccgcc cctacgaggg ccacaacacc gtgaagctga aggtgaccaa gggcggcccc 840 

ctgcccttcg cctgggacat cctgtccccc cagttccagt acggctccaa ggtgtacgtg 900 

aagcaccccg ccgacatccc cgactacaag aagctgtcct tccccgaggg cttcaagtgg 960 

gagcgcgtga tgaacttcga ggacggcggc gtggtgaccg tgacccagga ctcctccctg 1020 

caggacggct gcttcatcta caaggtgaag ttcatcggcg tgaacttccc ctccgacggc 1080 

cccgtaatgc agaagaagac catgggctgg gaggcctcca ccgagcgcct gtacccccgc 1140 

gacggcgtgc tgaagggcga gatccacaag gccctgaagc tgaaggacgg cggccactac 1200 

ctggtggagt tcaagtccat ctacatggcc aagaagcccg tgcagctgcc cggctactac 1260 

tacgtggact ccaagctgga catcacctcc cacaacgagg actacaccat cgtggagcag 1320 

tacgagcgca ccgagggccg ccaccacctg ttcctgtagc ggccgcgact ctagatcata 1380 

atcagccata ccacatttgt agaggtttta cttgctttaa aaaacctccc acacctcccc 1440 

ctgaacctga aacataaaat gaatgcaatt gttgttgtta acttgtttat tgcagcttat 1500 

aatggttaca aataaagcaa tagcatcaca aatttcacaa ataaagcatt tttttcactg 1560 

cattctagtt gtggtttgtc caaactcatc aatgtatctt aaggcgtaaa ttgtaagcgt 1620 

taatattttg ttaaaattcg cgttaaattt ttgttaaatc agctcatttt ttaaccaata 1680 

ggccgaaatc ggcaaaatcc cttataaatc aaaagaatag accgagatag ggttgagtgt 1740 

tgttccagtt tggaacaaga gtccactatt aaagaacgtg gactccaacg tcaaagggcg 1800 

aaaaaccgtc tatcagggcg atggcccact acgtgaacca tcaccctaat caagtttttt 1860 

ggggtcgagg tgccgtaaag cactaaatcg gaaccctaaa gggagccccc gatttagagc 1920 

ttgacgggga aagccggcga acgtggcgag aaaggaaggg aagaaagcga aaggagcggg 1980 

cgctagggcg ctggcaagtg tagcggtcac gctgcgcgta accaccacac ccgccgcgct 2040 

taatgcgccg ctacagggcg cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc 2100 

tatttgttta tttttctaaa tacattcaaa tatgtatccg ctcatgagac aataaccctg 2160 

ataaatgctt caataatatt gaaaaaggaa gagtcctgag gcggaaagaa ccagctgtgg 2220 

aatgtgtgtc agttagggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa 2280 

agcatgcatc tcaattagtc agcaaccagg tgtggaaagt ccccaggctc cccagcaggc 2340 

agaagtatgc aaagcatgca tctcaattag tcagcaacca tagtcccgcc cctaactccg 2400 

cccatcccgc ccctaactcc gcccagttcc gcccattctc cgccccatgg ctgactaatt 2460 

ttttttattt atgcagaggc cgaggccgcc tcggcctctg agctattcca gaagtagtga 2520 

ggaggctttt ttggaggcct aggcttttgc aaagatcgat caagagacag gatgaggatc 2580 

gtttcgcatg attgaacaag atggattgca cgcaggttct ccggccgctt gggtggagag 2640 

gctattcggc tatgactggg cacaacagac aatcggctgc tctgatgccg ccgtgttccg 2700 

gctgtcagcg caggggcgcc cggttctttt tgtcaagacc gacctgtccg gtgccctgaa 2760 

tgaactgcaa gacgaggcag cgcggctatc gtggctggcc acgacgggcg ttccttgcgc 2820 

agctgtgctc gacgttgtca ctgaagcggg aagggactgg ctgctattgg gcgaagtgcc 2880 

ggggcaggat ctcctgtcat ctcaccttgc tcctgccgag aaagtatcca tcatggctga 2940 

tgcaatgcgg cggctgcata cgcttgatcc ggctacctgc ccattcgacc accaagcgaa 3000 

acatcgcatc gagcgagcac gtactcggat ggaagccggt cttgtcgatc aggatgatct 3060 

ggacgaagag catcaggggc tcgcgccagc cgaactgttc gccaggctca aggcgagcat 3120 

gcccgacggc gaggatctcg tcgtgaccca tggcgatgcc tgcttgccga atatcatggt 3180 

ggaaaatggc cgcttttctg gattcatcga ctgtggccgg ctgggtgtgg cggaccgcta 3240 

tcaggacata gcgttggcta cccgtgatat tgctgaagag cttggcggcg aatgggctga 3300 

ccgcttcctc gtgctttacg gtatcgccgc tcccgattcg cagcgcatcg ccttctatcg 3360 

ccttcttgac gagttcttct gagcgggact ctggggttcg aaatgaccga ccaagcgacg 3420 

cccaacctgc catcacgaga tttcgattcc accgccgcct tctatgaaag gttgggcttc 3480 

ggaatcgttt tccgggacgc cggctggatg atcctccagc gcggggatct catgctggag 3540 

ttcttcgccc accctagggg gaggctaact gaaacacgga aggagacaat accggaagga 3600 

acccgcgcta tgacggcaat aaaaagacag aataaaacgc acggtgttgg gtcgtttgtt 3660 

cataaacgcg gggttcggtc ccagggctgg cactctgtcg ataccccacc gagaccccat 3720 

tggggccaat acgcccgcgt ttcttccttt tccccacccc accccccaag ttcgggtgaa 3780 

ggcccagggc tcgcagccaa cgtcggggcg gcaggccctg ccatagcctc aggttactca 3840 

tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc 3900 

ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca 3960 

gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc 4020 

tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta 4080 

ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgtcctt 4140 

ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc 4200 

gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg 4260 

ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg 4320 

tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag 4380 

ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc 4440 

agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat 4500 

agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg 4560 

gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc 4620 

tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga taaccgtatt 4680 

accgccatgc at 4692 



<210> 30 
<211> 4257 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pPur plasmid from Clontech 



<400> 30 

ctgtggaatg tgtgtcagtt agggtgtgga aagtccccag gctccccagc aggcagaagt 60 

atgcaaagca tgcatctcaa ttagtcagca accaggtgtg gaaagtcccc aggctcccca 120 

gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccatagt cccgccccta 180 

actccgccca tcccgcccct aactccgccc agttccgccc attctccgcc ccatggctga 240 

ctaatttttt ttatttatgc agaggccgag gccgcctcgg cctctgagct attccagaag 300 

tagtgaggag gcttttttgg aggcctaggc ttttgcaaaa agcttgcatg cctgcaggtc 360 

ggccgccacg accggtgccg ccaccatccc ctgacccacg cccctgaccc ctcacaagga 420 

gacgaccttc catgaccgag tacaagccca cggtgcgcct cgccacccgc gacgacgtcc 480 

cccgggccgt acgcaccctc gccgccgcgt tcgccgacta ccccgccacg cgccacaccg 540 

tcgacccgga ccgccacatc gagcgggtca ccgagctgca agaactcttc ctcacgcgcg 600 

tcgggctcga catcggcaag gtgtgggtcg cggacgacgg cgccgcggtg gcggtctgga 660 

ccacgccgga gagcgtcgaa gcgggggcgg tgttcgccga gatcggcccg cgcatggccg 720 

agttgagcgg ttcccggctg gccgcgcagc aacagatgga aggcctcctg gcgccgcacc 780 

ggcccaagga gcccgcgtgg ttcctggcca ccgtcggcgt ctcgcccgac caccagggca 840 

agggtctggg cagcgccgtc gtgctccccg gagtggaggc ggccgagcgc gccggggtgc 900 

ccgccttcct ggagacctcc gcgccccgca acctcccctt ctacgagcgg ctcggcttca 960 

ccgtcaccgc cgacgtcgag gtgcccgaag gaccgcgcac ctggtgcatg acccgcaagc 1020 

ccggtgcctg acgcccgccc cacgacccgc agcgcccgac cgaaaggagc gcacgacccc 1080 

atggctccga ccgaagccga cccgggcggc cccgccgacc ccgcacccgc ccccgaggcc 1140 

caccgactct agaggatcat aatcagccat accacatttg tagaggtttt acttgcttta 1200 

aaaaacctcc cacacctccc cctgaacctg aaacataaaa tgaatgcaat tgttgttgtt 1260 

aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 1320 

aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 1380 

tatcatgtct ggatccccag gaagctcctc tgtgtcctca taaaccctaa cctcctctac 1440 

ttgagaggac attccaatca taggctgccc atccaccctc tgtgtcctcc tgttaattag 1500 

gtcacttaac aaaaaggaaa ttgggtaggg gtttttcaca gaccgctttc taagggtaat 1560 

tttaaaatat ctgggaagtc ccttccactg ctgtgttcca gaagtgttgg taaacagccc 1620 

acaaatgtca acagcagaaa catacaagct gtcagctttg cacaagggcc caacaccctg 1680 

ctcatcaaga agcactgtgg ttgctgtgtt agtaatgtgc aaaacaggag gcacattttc 1740 

cccacctgtg taggttccaa aatatctagt gttttcattt ttacttggat caggaaccca 1800 

gcactccact ggataagcat tatccttatc caaaacagcc ttgtggtcag tgttcatctg 1860 

ctgactgtca actgtagcat tttttggggt tacagtttga gcaggatatt tggtcctgta 1920 

gtttgctaac acaccctgca gctccaaagg ttccccacca acagcaaaaa aatgaaaatt 1980 

tgacccttga atgggttttc cagcaccatt ttcatgagtt ttttgtgtcc ctgaatgcaa 2040 

gtttaacata gcagttaccc caataacctc agttttaaca gtaacagctt cccacatcaa 2100 

aatatttcca caggttaagt cctcatttaa attaggcaaa ggaattcttg aagacgaaag 2160 

ggcctcgtga tacgcctatt tttataggtt aatgtcatga taataatggt ttcttagacg 2220 

tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata 2280 

cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga 2340 

aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca 2400 

ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat 2460 

cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag 2520 

agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc 2580 

gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat acactattct 2640 

cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca 2700 

gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt 2760 

ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat 2820 

gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt 2880 

gacaccacga tgcctgcagc aatggcaaca acgttgcgca aactattaac tggcgaacta 2940 

cttactctag cttcccggca acaattaata gactggatgg aggcggataa agttgcagga 3000 

ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc tggagccggt 3060 

gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc ctcccgtatc 3120 

gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag acagatcgct 3180 

gagataggtg cctcactgat taagcattgg taactgtcag accaagttta ctcatatata 3240 

ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt 3300 

gataatctca tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc 3360 

gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg 3420 

caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact 3480 

ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg 3540 

tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg 3600 

ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac 3660 

tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca 3720 

cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga 3780 

gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc 3840 

ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct 3900 

gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg 3960 

agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct 4020 

tttgctcaca tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc 4080 

tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc 414 0 

gaggaagcgg aagagcgcct gatgcggtat tttctcctta cgcatctgtg cggtatttca 420 0 

caccgcatat ggtgcactct cagtacaatc tgctctgatg ccgcatagtt aagccag 4257 



<210> 31 
<211> 8136 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pWE15 cosmid vector 
<300> 

<308> GenBank X65279
<309> 1995-04-14 



<400> 31 

ctatagtgag tcgtattatg cggccgcgaa ttcttgaaga cgaaagggcc tcgtgatacg 60

cctattttta taggttaatg tcatgataat aatggtttct tagacgtcag gtggcacttt 120

tcggggaaat gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta 180

tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat 240

gagtattcaa catttccgtg tcgcccttat tccctttttt gcggcatttt gcttcctgtt 300

tttgctcacc cagaaacgct ggtgaaagta aaagatgctg aagatcagtt gggtgcacga 360

gtgggttaca tcgaactgga tctcaacagc ggtaagatcc ttgagagttt tcgccccgaa 420

gaacgttttc caatgatgag cacttttaaa gttctgctat gtggcgcggt attatcccgt 480

gttgacgccg ggcaagagca actcggtcgc cgcatacact attctcagaa tgacttggtt 540

gagtactcac cagtcacaga aaagcatctt acggatggca tgacagtaag agaattatgc 600

agtgctgcca taaccatgag tgataacact gcggccaact tacttctgac aacgatcgga 660

ggaccgaagg agctaaccgc ttttttgcac aacatggggg atcatgtaac tcgccttgat 720

cgttgggaac cggagctgaa tgaagccata ccaaacgacg agcgtgacac cacgatgcct 780

gcagcaatgg caacaacgtt gcgcaaacta ttaactggcg aactacttac tctagcttcc 840

cggcaacaat taatagactg gatggaggcg gataaagttg caggaccact tctgcgctcg 900

gcccttccgg ctggctggtt tattgctgat aaatctggag ccggtgagcg tgggtctcgc 960

ggtatcattg cagcactggg gccagatggt aagccctccc gtatcgtagt tatctacacg 1020

acggggagtc aggcaactat ggatgaacga aatagacaga tcgctgagat aggtgcctca 1080

ctgattaagc attggtaact gtcagaccaa gtttactcat atatacttta gattgattta 1140

aaacttcatt tttaatttaa aaggatctag gtgaagatcc tttttgataa tctcatgacc 1200

aaaatccctt aacgtgagtt ttcgttccac tgagcgtcag accccgtaga aaagatcaaa 1260

ggatcttctt gagatccttt ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca 1320

ccgctaccag cggtggtttg tttgccggat caagagctac caactctttt tccgaaggta 1380

actggcttca gcagagcgca gataccaaat actgtccttc tagtgtagcc gtagttaggc 1440

caccacttca agaactctgt agcaccgcct acatacctcg ctctgctaat cctgttacca 1500

gtggctgctg ccagtggcga taagtcgtgt cttaccgggt tggactcaag acgatagtta 1560

ccggataagg cgcagcggtc gggctgaacg gggggttcgt gcacacagcc cagcttggag 1620

cgaacgacct acaccgaact gagataccta cagcgtgagc tatgagaaag cgccacgctt 1680

ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 1740

cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg gtttcgccac 1800

ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac 1860

gccagcaacg cggccttttt acggttcctg gccttttgct ggccttttgc tcacatgttc 1920

tttcctgcgt tatcccctga ttctgtggat aaccgtatta ccgcctttga gtgagctgat 1980

accgctcgcc gcagccgaac gaccgagcgc agcgagtcag tgagcgagga agcggaagag 2040

cgctgacttc cgcgtttcca gactttacga aacacggaaa ccgaagacca ttcatgttgt 2100

tgctcaggtc gcagacgttt tgcagcagca gtcgcttcac gttcgctcgc gtatcggtga 2160

ttcattctgc taaccagtaa ggcaaccccg ccagcctagc cgggtcctca acgacaggag 2220

cacgatcatg cgcacccgtc agatccagac atgataagat acattgatga gtttggacaa 2280

accacaacta gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct 2340

ttatttgtaa ccattataag ctgcaataaa caagttaaca acaacaattg cattcatttt 2400

atgtttcagg ttcaggggga ggtgtgggag gttttttaaa gcaagtaaaa cctctacaaa 2460

tgtggtatgg ctgattatga tctctagtca aggcactata catcaaatat tccttattaa 2520

cccctttaca aattaaaaag ctaaaggtac acaatttttg agcatagtta ttaatagcag 2580

acactctatg cctgtgtgga gtaagaaaaa acagtatgtt atgattataa ctgttatgcc 2640

tacttataaa ggttacagaa tatttttcca taattttctt gtatagcagt gcagcttttt 2700

cctttgtggt gtaaatagca aagcaagcaa gagttctatt actaaacaca gcatgactca 2760

aaaaacttag caattctgaa ggaaagtcct tggggtcttc tacctttctc ttcttttttg 2820

gaggagtaga atgttgagag tcagcagtag cctcatcatc actagatggc atttcttctg 2880

agcaaaacag gttttcctca ttaaaggcat tccaccactg ctcccattca tcagttccat 2940

aggttggaat ctaaaataca caaacaatta gaatcagtag tttaacacat tatacactta 3000

aaaattttat atttacctta gagctttaaa tctctgtagg tagtttgtcc aattatgtca 3060

caccacagaa gtaaggttcc ttcacaaaga tccggaccaa agcggccatc gtgcctcccc 3120

actcctgcag ttcgggggca tggatgcgcg gatagccgct gctggtttcc tggatgccga 3180

cggatttgca ctgccggtag aactcgcgag gtcgtccagc ctcaggcagc agctgaacca 3240

actcgcgagg ggatcgagcc cggggtgggc gaagaactcc agcatgagat ccccgcgctg 3300

gaggatcatc cagccggcgt cccggaaaac gattccgaag cccaaccttt catagaaggc 3360

ggcggtggaa tcgaaatctc gtgatggcag gttgggcgtc gcttggtcgg tcatttcgaa 3420

ccccagagtc ccgctcagaa gaactcgtca agaaggcgat agaaggcgat gcgctgcgaa 3480

tcgggagcgg cgataccgta aagcacgagg aagcggtcag cccattcgcc gccaagctct 3540

tcagcaatat cacgggtagc caacgctatg tcctgatagc ggtccgccac acccagccgg 3600

ccacagtcga tgaatccaga aaagcggcca ttttccacca tgatattcgg caagcaggca 3660

tcgccatggg tcacgacgag atcctcgccg tcgggatgcg cgccttgagc ctggcgaaca 3720

gttcggctgg cgcgagcccc tgatgctctt cgtccagatc atcctgatcg acaagaccgg 3780

cttccatccg agtacgtgct cgctcgatgc gatgtttcgc ttggtggtcg aatgggcagg 3840

tagccggatc aagcgtatgc agccgccgca ttgcatcagc catgatggat actttctcgg 3900

caggagcaag gtgagatgac aggagatcct gccccggcac ttcgcccaat agcagccagt 3960

cccttcccgc ttcagtgaca acgtcgagca cagctgcgca aggaacgccc gtcgtggcca 4020

gccacgatag ccgcgctgcc tcgtcctgca gttcattcag ggcaccggac aggtcggtct 4080

tgacaaaaag aaccgggcgc ccctgcgctg acagccggaa cacggcggca tcagagcagc 4140

cgattgtctg ttgtgcccag tcatagccga atagcctctc cacccaagcg gccggagaac 4200

ctgcgtgcaa tccatcttgt tcaatcatgc gaaacgatcc tcatcctgtc tcttgatcag 4260

atcttgatcc cctgcgccat cagatccttg gcggcaagaa agccatccag tttactttgc 4320

agggcttccc aaccttacca gagggcgccc cagctggcaa ttccggttcg cttgctgtcc 4380

ataaaaccgc ccagtctagc tatcgccatg taagcccact gcaagctacc tgctttctct 4440

ttgcgcttgc gttttccctt gtccagatag cccagtagct gacattcatc cggggtcagc 4500

accgtttctg cggactggct ttctacgtgt tccgcttcct ttagcagccc ttgcgccctg 4560

agtgcttgcg gcagcgtgaa agctttttgc aaaagcctag gcctccaaaa aagcctcctc 4620

actacttctg gaatagctca gaggccgagg cggcctaaat aaaaaaaatt agtcagccat 4680

ggggcggaga atgggcggaa ctgggcggag ttaggggcgg gatgggcgga gttaggggcg 4740

ggactatggt tgctgactaa ttgagatgca tgctttgcat acttctgcct gctggggagc 4800

ctggggactt tccacacctg gttgctgact aattgagatg catgctttgc atacttctgc 4860

ctgctgggga gcctggggac tttccacacc ctaactgaca cacattccac agccggatct 4920

gcaggaccca acgctgcccg agatgcgccg cgtgcggctg ctggagatgg cggacgcgat 4980

ggatatgttc tgccaagggt tggtttgcgc attcacagtt ctccgcaaga attgattggc 5040

tccaattctt ggagtggtga atccgttagc gaggtgccgc cggcttccat tcaggtcgag 5100

gtggcccggc tccatgcacc gcgacgcaac gcggggaggc agacaaggta tagggcggcg 5160

cctacaatcc atgccaaccc gttccatgtg ctcgccgagg cgcataaatc gccgtgacga 5220

tcagcggtcc aatgatcgaa gttaggctgg taagagccgc gagcgatcct tgaagctgtc 5280

cctgatggtc gtcatctacc tgcctggaca gcatggcctg caacgcggca tcccgatgcc 5340

gccggaagcg agaagaatca taatggggaa ggccatccag cctcgcgtcg cgaacgccag 5400

caagacgtag cccagcgcgt cgggccgcca tgccggcgat aatggcctgc ttctcgccga 5460

aacgtttggt ggcgggacca gtgacgaagg cttgagcgag ggcgtgcaag attccgaata 5520

ccgcaagcga caggccgatc atcgtcgcgc tccagcgaaa gcggtcctcg ccgaaaatga 5580

cccagagcgc tgccggcacc tgtcctacga gttgcatgat aaagaagaca gtcataagtg 5640

cggcgacgat agtcatgccc cgcgcccacc ggaaggagct gactgggttg aaggctctca 5700

agggcatcgg tcgacgctct cccttatgcg actcctgcat taggaagcag cccagtagta 5760

ggttgaggcc gttgagcacc gccgccgcaa ggaatggtgc atgcaaggag atggcgccca 5820

acagtccccc ggccacgggc ctgccaccat acccacgccg aaacaagcgc tcatgagccc 5880

gaagtggcga gcccgatctt ccccatcggt gatgtcggcg atataggcgc cagcaaccgc 5940

acctgtggcg ccggtgatgc cggccacgat gcgtccggcg tagaggatct tggcagtcac 6000

agcatgcgca tatccatgct tcgaccatgc gctcacaaag taggtgaatg cgcaatgtag 6060

tacccacatc gtcatcgctt tccactgctc tcgcgaataa agatggaaaa tcaatctcat 6120

ggtaatagtc catgaaaatc cttgtattca taaatcctcc aggtagctat atgcaaattg 6180

aaacaaaaga gatggtgatc tttctaagag atgatggaat ctcccttcag tatcccgatg 6240

gtcaatgcgc tggatatggg atagatggga atatgctgat ttttatggga cagagttgcg 6300

aactgttccc aactaaaatc attttgcacg atcagcgcac tacgaacttt acccacaaat 6360

agtcaggtaa tgaatcctga tataaagaca ggttgataaa tcagtcttct acgcgcatcg 6420

cacgcgcaca ccgtagaaag tctttcagtt gtgagcctgg gcaaaccgtt aactttcggc 6480

ggctttgctg tgcgacaggc tcacgtctaa aaggaaataa atcatgggtc ataaaattat 6540

cacgttgtcc ggcgcggcga cggatgttct gtatgcgctg tttttccgtg gcgcgttgct 6600

gtctggtgat ctgccttcta aatctggcac agccgaattg cgcgagcttg gttttgctga 6660

aaccagacac acagcaactg aataccagaa agaaaatcac tttacctttc tgacatcaga 6720

agggcagaaa tttgccgttg aacacctggt caatacgcgt tttggtgagc agcaatattg 6780

cgcttcgatg acgcttggcg ttgagattga tacctctgct gcacaaaagg caatcgacga 6840

gctggaccag cgcattcgtg acaccgtctc cttcgaactt attcgcaatg gagtgtcatt 6900

catcaaggac gccgctatcg caaatggtgc tatccacgca gcggcaatcg aaacacctca 6960

gccggtgacc aatatctaca acatcagcct tggtatccag cgtgatgagc cagcgcagaa 7020

caaggtaacc gtcagtgccg ataagttcaa agttaaacct ggtgttgata ccaacattga 7080

aacgttgatc gaaaacgcgc tgaaaaacgc tgctgaatgt gcggcgctgg atgtcacaaa 7140

gcaaatggca gcagacaaga aagcgatgga tgaactggct tcctatgtcc gcacggccat 7200

catgatggaa tgtttccccg gtggtgttat ctggcagcag tgccgtcgat agtatgcaat 7260

tgataattat tatcatttgc gggtcctttc cggcgatccg ccttgttacg gggcggcgac 7320

ctcgcgggtt ttcgctattt atgaaaattt tccggtttaa ggcgtttccg ttcttcttcg 7380

tcataactta atgtttttat ttaaaatacc ctctgaaaag aaaggaaacg acaggtgctg 7440

aaagcgagct ttttggcctc tgtcgtttcc tttctctgtt tttgtccgtg gaatgaacaa 7500

tggaagtcaa caaaaagcag ctggctgaca ttttcggtgc gagtatccgt accattcaga 7560

actggcagga acagggaatg cccgttctgc gaggcggtgg caagggtaat gaggtgcttt 7620

atgactctgc cgccgtcata aaatggtatg ccgaaaggga tgctgaaatt gagaacgaaa 7680

agctgcgccg ggaggttgaa gaactgcggc aggccagcga ggcagatcca caggacgggt 7740

gtggtcgcca tgatcgcgta gtcgatagtg gctccaagta gcgaagcgag caggactggg 7800

cggcggcaaa gcggtcggac agtgctccga gaacgggtgc gcatagaaat tgcatcaacg 7860

catatagcgc tagcagcacg ccatagtgac tggcgatgct gtcggaatgg acgatatccc 7920

gcaagaggcc cggcagtacc ggcataacca agcctatgcc tacagcatcc agggtgacgg 7980

tgccgaggat gacgatgagc gcattgttag atttcataca cggtgcctga ctgcgttagc 8040

aatttaactg tgataaacta ccgcattaaa gcttatcgat gataagcggt caaacatgag 8100

aattcgcggc cgcaattaac cctcactaaa ggatcc 8136

<210> 32
<211> 2713
<212> DNA

<213> Artificial Sequence
<220>

<223> pNEB193 plasmid
<400> 32

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180

accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240

attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300

tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360

tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt cgagctcggt acccgggggc 420

gcgccggatc cttaattaag tctagagtcg actgtttaaa cctgcaggca tgcaagcttg 480

gcgtaatcat ggtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac 540

aacatacgag ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc 600

acattaattg cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg 660

cattaatgaa tcggccaacg cgcggggaga ggcggtttgc gtattgggcg ctcttccgct 720

tcctcgctca ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac 780

tcaaaggcgg taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga 840

gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat 900

aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac 960
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1020

gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1080

ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1140

ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1200

cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1260

attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1320

ggctacacta gaaggacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga 1380

aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt 1440

gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt 1500

tctacggggt ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga 1560

ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc 1620

taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct 1680

atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt cgtgtagata 1740

actacgatac gggagggctt accatctggc cccagtgctg caatgatacc gcgagaccca 1800

cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc cgagcgcaga 1860

agtggtcctg caactttatc cgcctccatc cagtctatta attgttgccg ggaagctaga 1920

gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac aggcatcgtg 1980

gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga 2040

gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt 2100

gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct 2160

cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca 2220

ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat acgggataat 2280

accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga 2340

aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc 2400

aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg 2460

caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc 2520

ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt 2580

gaatgtattt agaaaaataa acaaataggg gttccgcgca catttccccg aaaagtgcca 2640

cctgacgtct aagaaaccat tattatcatg acattaacct ataaaaatag gcgtatcacg 2700

aggccctttc gtc 2713

<210> 33 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP 
<400> 33 

cagctttttt atactaagtt g 21 

<210> 34 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB 
<400> 34 

ctgctttttt atactaactt g 21 

<210> 35 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL 
<400> 35 

ctgctttttt atactaagtt g 21 



<210> 36 
<211> 21 
<212> DNA 
<213> Artificial Sequence
<220>

<223> attR
<400> 36 

cagctttttt atactaactt g 21 

<210> 37 
<211> 1071 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Integrase E174R

<221> CDS

<222> (1)...(1071)

<223> Nucleotide sequence encoding Integrase E174R
<400> 37 

atg gga aga agg cga agt cat gag cgc cgg gat tta ccc cct aac ctt 48
Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu
1 5 10 15

tat ata aga aac aat gga tat tac tgc tac agg gac cca agg acg ggt 96
Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly
20 25 30

aaa gag ttt gga tta ggc aga gac agg cga atc gca atc act gaa gct 144
Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala
35 40 45

ata cag gcc aac att gag tta ttt tca gga cac aaa cac aag cct ctg 192
Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu
50 55 60

aca gcg aga atc aac agt gat aat tcc gtt acg tta cat tca tgg ctt 240
Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu
65 70 75 80

gat cgc tac gaa aaa atc ctg gcc agc aga gga atc aag cag aag aca 288
Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr
85 90 95

ctc ata aat tac atg agc aaa att aaa gca ata agg agg ggt ctg cct 336
Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro
100 105 110

gat gct cca ctt gaa gac atc acc aca aaa gaa att gcg gca atg ctc 384
Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu
115 120 125

aat gga tac ata gac gag ggc aag gcg gcg tca gcc aag tta atc aga 432
Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg
130 135 140

tca aca ctg agc gat gca ttc cga gag gca ata gct gaa ggc cat ata 480
Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile
145 150 155 160

aca aca aac cat gtc gct gcc act cgc gca gca aaa tct aga gta agg 528
Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg
165 170 175

aga tca aga ctt acg gct gac gaa tac ctg aaa att tat caa gca gca 576
Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala
180 185 190

gaa tca tca cca tgt tgg ctc aga ctt gca atg gaa ctg gct gtt gtt 624
Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val
195 200 205

acc ggg caa cga gtt ggt gat tta tgc gaa atg aag tgg tct gat atc 672
Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile
210 215 220

gta gat gga tat ctt tat gtc gag caa agc aaa aca ggc gta aaa att 720
Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile
225 230 235 240

gcc atc cca aca gca ttg cat att gat gct ctc gga ata tca atg aag 768
Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys
245 250 255

gaa aca ctt gat aaa tgc aaa gag att ctt ggc gga gaa acc ata att 816
Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile
260 265 270

gca tct act cgt cgc gaa ccg ctt tca tcc ggc aca gta tca agg tat 864
Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr
275 280 285

ttt atg cgc gca cga aaa gca tca ggt ctt tcc ttc gaa ggg gat ccg 912
Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro
290 295 300

cct acc ttt cac gag ttg cgc agt ttg tct gca aga ctc tat gag aag 960
Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys
305 310 315 320

cag ata agc gat aag ttt gct caa cat ctt ctc ggg cat aag tcg gac 1008
Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp
325 330 335

acc atg gca tca cag tat cgt gat gac aga ggc agg gag tgg gac aaa 1056
Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys
340 345 350

att gaa atc aaa taa 1071
Ile Glu Ile Lys *
355



<210> 38
<211> 356
<212> PRT

<213> Artificial Sequence
<220>

<223> Integrase E174R
<400> 38

Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu
1 5 10 15

Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly
20 25 30

Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala
35 40 45

Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu
50 55 60

Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu
65 70 75 80

Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr
85 90 95

Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro
100 105 110

Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu
115 120 125

Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg
130 135 140

Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile
145 150 155 160

Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg
165 170 175

Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala
180 185 190

Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val
195 200 205

Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile
210 215 220

Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile
225 230 235 240

Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys
245 250 255

Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile
260 265 270

Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr
275 280 285

Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro
290 295 300

Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys
305 310 315 320

Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp
325 330 335

Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys
340 345 350

Ile Glu Ile Lys
355




<210> 39 

<211> 876 

<212> DNA 

<213> Discosoma species 

<220> 

<221> CDS 

<222> (45)...(737)

<223> Nucleotide sequence encoding red fluorescent protein (FP593)

<300> 

<308> GenBank AF272711 

<309> 2000-09-26 

<400> 39 

agtttcagcc agtgacaggg tgagctgcca ggtattctaa caag atg agt tgt tcc 56
Met Ser Cys Ser
1

aag aat gtg atc aag gag ttc atg agg ttc aag gtt cgt atg gaa gga 104
Lys Asn Val Ile Lys Glu Phe Met Arg Phe Lys Val Arg Met Glu Gly
5 10 15 20

acg gtc aat ggg cac gag ttt gaa ata aaa ggc gaa ggt gaa ggg agg 152
Thr Val Asn Gly His Glu Phe Glu Ile Lys Gly Glu Gly Glu Gly Arg
25 30 35

cct tac gaa ggt cac tgt tcc gta aag ctt atg gta acc aag ggt gga 200
Pro Tyr Glu Gly His Cys Ser Val Lys Leu Met Val Thr Lys Gly Gly
40 45 50

cct ttg cca ttt gct ttt gat att ttg tca cca caa ttt cag tat gga 248
Pro Leu Pro Phe Ala Phe Asp Ile Leu Ser Pro Gln Phe Gln Tyr Gly
55 60 65

agc aag gta tat gtc aaa cac cct gcc gac ata cca gac tat aaa aag 296
Ser Lys Val Tyr Val Lys His Pro Ala Asp Ile Pro Asp Tyr Lys Lys
70 75 80

ctg tca ttt cct gag gga ttt aaa tgg gaa agg gtc atg aac ttt gaa 344
Leu Ser Phe Pro Glu Gly Phe Lys Trp Glu Arg Val Met Asn Phe Glu
85 90 95 100

gac ggt ggc gtg gtt act gta tcc caa gat tcc agt ttg aaa gac ggc 392
Asp Gly Gly Val Val Thr Val Ser Gln Asp Ser Ser Leu Lys Asp Gly
105 110 115

tgt ttc atc tac gag gtc aag ttc att ggg gtg aac ttt cct tct gat 440
Cys Phe Ile Tyr Glu Val Lys Phe Ile Gly Val Asn Phe Pro Ser Asp
120 125 130

gga cct gtt atg cag agg agg aca cgg ggc tgg gaa gcc agc tct gag 488
Gly Pro Val Met Gln Arg Arg Thr Arg Gly Trp Glu Ala Ser Ser Glu
135 140 145

cgt ttg tat cct cgt gat ggg gtg ctg aaa gga gac atc cat atg gct 536
Arg Leu Tyr Pro Arg Asp Gly Val Leu Lys Gly Asp Ile His Met Ala
150 155 160

ctg agg ctg gaa gga ggc ggc cat tac ctc gtt gaa ttc aaa agt att 584
Leu Arg Leu Glu Gly Gly Gly His Tyr Leu Val Glu Phe Lys Ser Ile
165 170 175 180

tac atg gta aag aag cct tca gtg cag ttg cca ggc tac tat tat gtt 632
Tyr Met Val Lys Lys Pro Ser Val Gln Leu Pro Gly Tyr Tyr Tyr Val
185 190 195

gac tcc aaa ctg gat atg acg agc cac aac gaa gat tac aca gtc gtt 680
Asp Ser Lys Leu Asp Met Thr Ser His Asn Glu Asp Tyr Thr Val Val
200 205 210

gag cag tat gaa aaa acc cag gga cgc cac cat ccg ttc att aag cct 728
Glu Gln Tyr Glu Lys Thr Gln Gly Arg His His Pro Phe Ile Lys Pro
215 220 225

ctg cag tga actcggctca gtcatggatt agcggtaatg gccacaaaag 777
Leu Gln *
230

gcacgatgat cgttttttag gaatgcagcc aaaaattgaa ggttatgaca gtagaaatac 837
aagcaacagg ctttgcttat taaacatgta attgaaaac 876

<210> 40 
<211> 230 
<212> PRT 

<213> Discosoma species 
<400> 40 

Met Ser Cys Ser Lys Asn Val Ile Lys Glu Phe Met Arg Phe Lys Val
1 5 10 15

Arg Met Glu Gly Thr Val Asn Gly His Glu Phe Glu Ile Lys Gly Glu
20 25 30

Gly Glu Gly Arg Pro Tyr Glu Gly His Cys Ser Val Lys Leu Met Val
35 40 45

Thr Lys Gly Gly Pro Leu Pro Phe Ala Phe Asp Ile Leu Ser Pro Gln
50 55 60

Phe Gln Tyr Gly Ser Lys Val Tyr Val Lys His Pro Ala Asp Ile Pro
65 70 75 80

Asp Tyr Lys Lys Leu Ser Phe Pro Glu Gly Phe Lys Trp Glu Arg Val
85 90 95

Met Asn Phe Glu Asp Gly Gly Val Val Thr Val Ser Gln Asp Ser Ser
100 105 110

Leu Lys Asp Gly Cys Phe Ile Tyr Glu Val Lys Phe Ile Gly Val Asn
115 120 125

Phe Pro Ser Asp Gly Pro Val Met Gln Arg Arg Thr Arg Gly Trp Glu
130 135 140

Ala Ser Ser Glu Arg Leu Tyr Pro Arg Asp Gly Val Leu Lys Gly Asp
145 150 155 160

Ile His Met Ala Leu Arg Leu Glu Gly Gly Gly His Tyr Leu Val Glu
165 170 175

Phe Lys Ser Ile Tyr Met Val Lys Lys Pro Ser Val Gln Leu Pro Gly
180 185 190

Tyr Tyr Tyr Val Asp Ser Lys Leu Asp Met Thr Ser His Asn Glu Asp
195 200 205

Tyr Thr Val Val Glu Gln Tyr Glu Lys Thr Gln Gly Arg His His Pro
210 215 220

Phe Ile Lys Pro Leu Gln
225 230



<210> 41 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-att

<221> misc_difference
<222> 18 

<223> n is a or g or c or t/u 
<400> 41 

rkycwgcttt yktrtacnaa stsgb 25 

<210> 42 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attB

<221> misc_difference
<222> 18 

<223> n is a or g or c or t/u 
<400> 42 

agccwgcttt yktrtacnaa ctsgb 25 

<210> 43 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attR 



<221> misc_difference
<222> 18 

<223> n is a or g or c or t/u 




<400> 43 

gttcagcttt cktrtacnaa ctsgb 25 

<210> 44 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attL 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u
<400> 44 

agccwgcttt cktrtacnaa gtsgb 25

<210> 45 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attP1

<221> misc_difference
<222> 18 

<223> n is a or g or c or t/u 
<400> 45 

gttcagcttt yktrtacnaa gtsgb 25 

<210> 46 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB1

<400> 46 

agcctgcttt tttgtacaaa cttgt 25 

<210> 47 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB2 

<400> 47 

agcctgcttt cttgtacaaa cttgt 25 

<210> 48 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB3 
<400> 48 

acccagcttt cttgtacaaa cttgt 25 



<210> 49 
<211> 25
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR1
<400> 49 

gttcagcttt tttgtacaaa cttgt 25 

<210> 50 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR2 
<400> 50 

gttcagcttt cttgtacaaa cttgt 25 

<210> 51 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR3 
<400> 51 

gttcagcttt cttgtacaaa gttgg 25 

<210> 52 

<211> 25

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL1

<400> 52 

agcctgcttt tttgtacaaa gttgg 25 

<210> 53 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL2 
<400> 53 

agcctgcttt cttgtacaaa gttgg 25 

<210> 54 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL3 



<400> 54

acccagcttt cttgtacaaa gttgg 25



<210> 55 
<211> 25 
<212> DNA

<213> Artificial Sequence
<220>

<223> attP1
<400> 55 

gttcagcttt tttgtacaaa gttgg 25 

<210> 56 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP2,P3 

<400> 56 

gttcagcttt cttgtacaaa gttgg 25 

<210> 57 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Lox P site
<400> 57 

ataacttcgt ataatgtatg ctatacgaag ttat 34 

<210> 58 
<211> 1032 
<212> DNA 

<213> Escherichia coli 

<220> 

<221> CDS

<222> (1)...(1032)

<223> nucleotide sequence encoding Cre recombinase 
<400> 58 

atg tcc aat tta ctg acc gta cac caa aat ttg cct gca tta ccg gtc 48
Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val
1 5 10 15

gat gca acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc agg 96
Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg
20 25 30

gat cgc cag gcg ttt tct gag cat acc tgg aaa atg ctt ctg tcc gtt 144
Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val
35 40 45

tgc cgg tcg tgg gcg gca tgg tgc aag ttg aat aac cgg aaa tgg ttt 192
Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe
50 55 60

ccc gca gaa cct gaa gat gtt cgc gat tat ctt cta tat ctt cag gcg 240
Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala
65 70 75 80

cgc ggt ctg gca gta aaa act atc cag caa cat ttg ggc cag cta aac 288
Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn
85 90 95

atg ctt cat cgt cgg tcc ggg ctg cca cga cca agt gac agc aat gct 336
Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala
100 105 110

gtt tca ctg gtt atg cgg cgg atc cga aaa gaa aac gtt gat gcc ggt 384
Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly
115 120 125

gaa cgt gca aaa cag gct cta gcg ttc gaa cgc act gat ttc gac cag 432
Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln
130 135 140

gtt cgt tca ctc atg gaa aat agc gat cgc tgc cag gat ata cgt aat 480
Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn
145 150 155 160

ctg gca ttt ctg ggg att gct tat aac acc ctg tta cgt ata gcc gaa 528
Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu
165 170 175

att gcc agg atc agg gtt aaa gat atc tca cgt act gac ggt ggg aga 576
Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg
180 185 190

atg tta atc cat att ggc aga acg aaa acg ctg gtt agc acc gca ggt 624
Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly
195 200 205

gta gag aag gca ctt agc ctg ggg gta act aaa ctg gtc gag cga tgg 672
Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp
210 215 220

att tcc gtc tct ggt gta gct gat gat ccg aat aac tac ctg ttt tgc 720
Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys
225 230 235 240

cgg gtc aga aaa aat ggt gtt gcc gcg cca tct gcc acc agc cag cta 768
Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu
245 250 255

tca act cgc gcc ctg gaa ggg att ttt gaa gca act cat cga ttg att 816
Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile
260 265 270

tac ggc gct aag gat gac tct ggt cag aga tac ctg gcc tgg tct gga 864
Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly
275 280 285

cac agt gcc cgt gtc gga gcc gcg cga gat atg gcc cgc gct gga gtt 912
His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val
290 295 300

tca ata ccg gag atc atg caa gct ggt ggc tgg acc aat gta aat att 960
Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile
305 310 315 320

gtc atg aac tat atc cgt aac ctg gat agt gaa aca ggg gca atg gtg 1008
Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val
325 330 335

cgc ctg ctg gaa gat ggc gat tag 1032
Arg Leu Leu Glu Asp Gly Asp *
340



<210> 59 
<211> 343 
<212> PRT 




<213> Escherichia coli 



<400> 59

Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val
1 5 10 15

Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg
20 25 30

Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val
35 40 45

Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe
50 55 60

Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala
65 70 75 80

Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn
85 90 95

Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala
100 105 110

Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly
115 120 125

Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln
130 135 140

Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn
145 150 155 160

Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu
165 170 175

Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg
180 185 190

Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly
195 200 205

Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp
210 215 220

Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys
225 230 235 240

Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu
245 250 255

Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile
260 265 270

Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly
275 280 285

His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val
290 295 300

Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile
305 310 315 320

Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val
325 330 335

Arg Leu Leu Glu Asp Gly Asp
340



<210> 60 
<211> 1272 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 

<221> CDS 

<222> (1) . . . (1272) 

<223> nucleotide sequence encoding Flip recombinase 
<400> 60 

atg cca caa ttt ggt ata tta tgt aaa aca cca cct aag gtg ctt gtt 48
Met Pro Gln Phe Gly Ile Leu Cys Lys Thr Pro Pro Lys Val Leu Val
1 5 10 15

cgt cag ttt gtg gaa agg ttt gaa aga cct tca ggt gag aaa ata gca 96
Arg Gln Phe Val Glu Arg Phe Glu Arg Pro Ser Gly Glu Lys Ile Ala
20 25 30

tta tgt gct gct gaa cta acc tat tta tgt tgg atg att aca cat aac 144
Leu Cys Ala Ala Glu Leu Thr Tyr Leu Cys Trp Met Ile Thr His Asn
35 40 45

gga aca gca atc aag aga gcc aca ttc atg agc tat aat act atc ata 192
Gly Thr Ala Ile Lys Arg Ala Thr Phe Met Ser Tyr Asn Thr Ile Ile
50 55 60

agc aat tcg ctg agt ttc gat att gtc aat aaa tca ctc cag ttt aaa 240
Ser Asn Ser Leu Ser Phe Asp Ile Val Asn Lys Ser Leu Gln Phe Lys
65 70 75 80

tac aag acg caa aaa gca aca att ctg gaa gcc tca tta aag aaa ttg 288
Tyr Lys Thr Gln Lys Ala Thr Ile Leu Glu Ala Ser Leu Lys Lys Leu
85 90 95

att cct gct tgg gaa ttt aca att att cct tac tat gga caa aaa cat 336
Ile Pro Ala Trp Glu Phe Thr Ile Ile Pro Tyr Tyr Gly Gln Lys His
100 105 110

caa tct gat atc act gat att gta agt agt ttg caa tta cag ttc gaa 384
Gln Ser Asp Ile Thr Asp Ile Val Ser Ser Leu Gln Leu Gln Phe Glu
115 120 125

tca tcg gaa gaa gca gat aag gga aat agc cac agt aaa aaa atg ctt 432
Ser Ser Glu Glu Ala Asp Lys Gly Asn Ser His Ser Lys Lys Met Leu
130 135 140

aaa gca ctt cta agt gag ggt gaa agc atc tgg gag atc act gag aaa 480
Lys Ala Leu Leu Ser Glu Gly Glu Ser Ile Trp Glu Ile Thr Glu Lys
145 150 155 160

ata cta aat tcg ttt gag tat act tcg aga ttt aca aaa aca aaa act 528
Ile Leu Asn Ser Phe Glu Tyr Thr Ser Arg Phe Thr Lys Thr Lys Thr
165 170 175

tta tac caa ttc ctc ttc cta gct act ttc atc aat tgt gga aga ttc 576
Leu Tyr Gln Phe Leu Phe Leu Ala Thr Phe Ile Asn Cys Gly Arg Phe
180 185 190

agc gat att aag aac gtt gat ccg aaa tca ttt aaa tta gtc caa aat 624
Ser Asp Ile Lys Asn Val Asp Pro Lys Ser Phe Lys Leu Val Gln Asn
195 200 205

aag tat ctg gga gta ata atc cag tgt tta gtg aca gag aca aag aca 672
Lys Tyr Leu Gly Val Ile Ile Gln Cys Leu Val Thr Glu Thr Lys Thr
210 215 220

agc gtt agt agg cac ata tac ttc ttt agc gca agg ggt agg atc gat 720
Ser Val Ser Arg His Ile Tyr Phe Phe Ser Ala Arg Gly Arg Ile Asp
225 230 235 240

cca ctt gta tat ttg gat gaa ttt ttg agg aat tct gaa cca gtc cta 768
Pro Leu Val Tyr Leu Asp Glu Phe Leu Arg Asn Ser Glu Pro Val Leu
245 250 255

aaa cga gta aat agg acc ggc aat tct tca agc aat aaa cag gaa tac 816
Lys Arg Val Asn Arg Thr Gly Asn Ser Ser Ser Asn Lys Gln Glu Tyr
260 265 270

caa tta tta aaa gat aac tta gtc aga tcg tac aat aaa gct ttg aag 864
Gln Leu Leu Lys Asp Asn Leu Val Arg Ser Tyr Asn Lys Ala Leu Lys
275 280 285

aaa aat gcg cct tat tca atc ttt gct ata aaa aat ggc cca aaa tct 912
Lys Asn Ala Pro Tyr Ser Ile Phe Ala Ile Lys Asn Gly Pro Lys Ser
290 295 300

cac att gga aga cat ttg atg acc tca ttt ctt tca atg aag ggc cta 960
His Ile Gly Arg His Leu Met Thr Ser Phe Leu Ser Met Lys Gly Leu
305 310 315 320

acg gag ttg act aat gtt gtg gga aat tgg agc gat aag cgt gct tct 1008
Thr Glu Leu Thr Asn Val Val Gly Asn Trp Ser Asp Lys Arg Ala Ser
325 330 335

gcc gtg gcc agg aca acg tat act cat cag ata aca gca ata cct gat 1056
Ala Val Ala Arg Thr Thr Tyr Thr His Gln Ile Thr Ala Ile Pro Asp
340 345 350

cac tac ttc gca cta gtt tct cgg tac tat gca tat gat cca ata tca 1104
His Tyr Phe Ala Leu Val Ser Arg Tyr Tyr Ala Tyr Asp Pro Ile Ser
355 360 365

aag gaa atg ata gca ttg aag gat gag act aat cca att gag gag tgg 1152
Lys Glu Met Ile Ala Leu Lys Asp Glu Thr Asn Pro Ile Glu Glu Trp
370 375 380

cag cat ata gaa cag cta aag ggt agt gct gaa gga agc ata cga tac 1200
Gln His Ile Glu Gln Leu Lys Gly Ser Ala Glu Gly Ser Ile Arg Tyr
385 390 395 400

ccc gca tgg aat ggg ata ata tca cag gag gta cta gac tac ctt tca 1248
Pro Ala Trp Asn Gly Ile Ile Ser Gln Glu Val Leu Asp Tyr Leu Ser
405 410 415

tcc tac ata aat aga cgc ata taa 1272
Ser Tyr Ile Asn Arg Arg Ile *
420

<210> 61 
<211> 422 
<212> PRT 

<213> Saccharomyces cerevisiae 



<400> 61

Pro Gln Phe Gly Ile Leu Cys Lys Thr Pro Pro Lys Val Leu Val Arg
1 5 10 15

Gln Phe Val Glu Arg Phe Glu Arg Pro Ser Gly Glu Lys Ile Ala Leu
20 25 30

Cys Ala Ala Glu Leu Thr Tyr Leu Cys Trp Met Ile Thr His Asn Gly
35 40 45

Thr Ala Ile Lys Arg Ala Thr Phe Met Ser Tyr Asn Thr Ile Ile Ser
50 55 60

Asn Ser Leu Ser Phe Asp Ile Val Asn Lys Ser Leu Gln Phe Lys Tyr
65 70 75 80

Lys Thr Gln Lys Ala Thr Ile Leu Glu Ala Ser Leu Lys Lys Leu Ile
85 90 95

Pro Ala Trp Glu Phe Thr Ile Ile Pro Tyr Tyr Gly Gln Lys His Gln
100 105 110

Ser Asp Ile Thr Asp Ile Val Ser Ser Leu Gln Leu Gln Phe Glu Ser
115 120 125

Ser Glu Glu Ala Asp Lys Gly Asn Ser His Ser Lys Lys Met Leu Lys
130 135 140

Ala Leu Leu Ser Glu Gly Glu Ser Ile Trp Glu Ile Thr Glu Lys Ile
145 150 155 160

Leu Asn Ser Phe Glu Tyr Thr Ser Arg Phe Thr Lys Thr Lys Thr Leu
165 170 175

Tyr Gln Phe Leu Phe Leu Ala Thr Phe Ile Asn Cys Gly Arg Phe Ser
180 185 190

Asp Ile Lys Asn Val Asp Pro Lys Ser Phe Lys Leu Val Gln Asn Lys
195 200 205

Tyr Leu Gly Val Ile Ile Gln Cys Leu Val Thr Glu Thr Lys Thr Ser
210 215 220

Val Ser Arg His Ile Tyr Phe Phe Ser Ala Arg Gly Arg Ile Asp Pro
225 230 235 240

Leu Val Tyr Leu Asp Glu Phe Leu Arg Asn Ser Glu Pro Val Leu Lys
245 250 255

Arg Val Asn Arg Thr Gly Asn Ser Ser Ser Asn Lys Gln Glu Tyr Gln
260 265 270

Leu Leu Lys Asp Asn Leu Val Arg Ser Tyr Asn Lys Ala Leu Lys Lys
275 280 285

Asn Ala Pro Tyr Ser Ile Phe Ala Ile Lys Asn Gly Pro Lys Ser His
290 295 300

Ile Gly Arg His Leu Met Thr Ser Phe Leu Ser Met Lys Gly Leu Thr
305 310 315 320

Glu Leu Thr Asn Val Val Gly Asn Trp Ser Asp Lys Arg Ala Ser Ala
325 330 335

Val Ala Arg Thr Thr Tyr Thr His Gln Ile Thr Ala Ile Pro Asp His
340 345 350

Tyr Phe Ala Leu Val Ser Arg Tyr Tyr Ala Tyr Asp Pro Ile Ser Lys
355 360 365

Glu Met Ile Ala Leu Lys Asp Glu Thr Asn Pro Ile Glu Glu Trp Gln
370 375 380

His Ile Glu Gln Leu Lys Gly Ser Ala Glu Gly Ser Ile Arg Tyr Pro
385 390 395 400

Ala Trp Asn Gly Ile Ile Ser Gln Glu Val Leu Asp Tyr Leu Ser Ser
405 410 415

Tyr Ile Asn Arg Arg Ile
420

<210> 62 

<211> 48 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> IR2 

<400> 62 

gaagttccta ttccgaagtt cctattctct agaaagtata ggaacttc 48 

<210> 63 

<211> 48 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> IR1 

<400> 63 

gaagttccta tactttctag agaataggaa cttcggaata ggaacttc 48
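
The relationship between the two inverted-repeat entries can be checked mechanically. The following is a minimal Python sketch, not part of the sequence listing: the helper name `reverse_complement` and the variable names are illustrative assumptions, and the sequences are transcribed from SEQ ID NOs: 62 and 63 with the spacing removed. It suggests that IR1 reads as the reverse complement of IR2.

```python
# Sketch: compare IR1 (SEQ ID NO: 63) with the reverse complement of IR2 (SEQ ID NO: 62).
# All names here are illustrative and do not come from the sequence listing.

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences transcribed from the listing, 10-base blocks joined.
IR2 = "gaagttcctattccgaagttcctattctctagaaagtataggaacttc"
IR1 = "gaagttcctatactttctagagaataggaacttcggaataggaacttc"

if __name__ == "__main__":
    print("lengths:", len(IR2), len(IR1))
    print("IR1 is reverse complement of IR2:", reverse_complement(IR2) == IR1)
```

A check of this kind is also a convenient way to catch single-character transcription errors in either entry.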

<210> 64 

<211> 66 

<212> DNA 

<213> Bacteriophage mu 

<220> 

<221> CDS 

<222> (1) . . . (66) 

<223> nucleotide sequence encoding GIN recombinase 

<400> 64 

tca act ctg tat aaa aaa cac ccc gcg aaa cga gcg cat ata gaa aac 48
Ser Thr Leu Tyr Lys Lys His Pro Ala Lys Arg Ala His Ile Glu Asn
1 5 10 15

gac gat cga atc aat taa 66
Asp Asp Arg Ile Asn *
20

<210> 65 
<211> 21 
<212> PRT 

<213> Bacteriophage mu
<400> 65

Ser Thr Leu Tyr Lys Lys His Pro Ala Lys Arg Ala His Ile Glu Asn
1 5 10 15

Asp Asp Arg Ile Asn
20
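
The CDS entry and its protein entry can be cross-checked by translating the nucleotide sequence. The following is a minimal Python sketch under stated assumptions: it uses the standard genetic code, the names `CODON_TABLE` and `translate` are illustrative, and the input strings are transcribed from SEQ ID NOs: 64 and 65 (the latter rewritten in one-letter code).

```python
# Sketch: translate the SEQ ID NO: 64 coding sequence with the standard genetic code
# and compare it with SEQ ID NO: 65. Names are illustrative, not from the listing.

BASES = "tcag"
# Standard genetic code, codons ordered ttt, ttc, tta, ttg, tct, ... ggg.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence (whitespace ignored) to one-letter amino acids."""
    cds = "".join(cds.split()).lower()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


SEQ64 = ("tca act ctg tat aaa aaa cac ccc gcg aaa cga gcg cat ata gaa aac "
         "gac gat cga atc aat taa")
SEQ65 = "STLYKKHPAKRAHIENDDRIN"  # SEQ ID NO: 65 in one-letter code

if __name__ == "__main__":
    protein = translate(SEQ64)
    print(protein)
    print("matches SEQ ID NO: 65:", protein.rstrip("*") == SEQ65)
```

The same helper can be applied to the other CDS entries in the listing to confirm that each three-letter translation line agrees with its codon line.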

<210> 66 
<211> 69 
<212> DNA 

<213> Bacteriophage mu

<220>

<221> CDS

<222> (1) . . . (69)

<223> nucleotide sequence encoding Gin recombinase
<400> 66

tat aaa aaa cat ccc gcg aaa cga acg cat ata gaa aac gac gat cga 48
Tyr Lys Lys His Pro Ala Lys Arg Thr His Ile Glu Asn Asp Asp Arg
1 5 10 15

atc aat caa atc gat cgg taa 69
Ile Asn Gln Ile Asp Arg *
20

<210> 67
<211> 22
<212> PRT

<213> Bacteriophage mu
<220>

<223> Gin recombinase of bacteriophage mu
<400> 67

Tyr Lys Lys His Pro Ala Lys Arg Thr His Ile Glu Asn Asp Asp Arg
1 5 10 15

Ile Asn Gln Ile Asp Arg
20

<210> 68 
<211> 555 
<212> DNA 

<213> Escherichia coli 

<220> 

<221> CDS

<222> (1) . . . (555)

<223> nucleotide sequence encoding PIN recombinase
<400> 68

atg ctt att ggc tat gta cgc gta tca aca aat gac cag aac aca gat 48
Met Leu Ile Gly Tyr Val Arg Val Ser Thr Asn Asp Gln Asn Thr Asp
1 5 10 15

cta caa cgt aat gcg ctg aac tgt gca gga tgc gag ctg att ttt gaa 96
Leu Gln Arg Asn Ala Leu Asn Cys Ala Gly Cys Glu Leu Ile Phe Glu
20 25 30

gac aag ata agc ggc aca aag tcc gaa agg ccg gga ctg aaa aaa ctg 144
Asp Lys Ile Ser Gly Thr Lys Ser Glu Arg Pro Gly Leu Lys Lys Leu
35 40 45

ctc agg aca tta tcg gca ggt gac act ctg gtt gtc tgg aag ctg gat 192
Leu Arg Thr Leu Ser Ala Gly Asp Thr Leu Val Val Trp Lys Leu Asp
50 55 60

cgg ctg ggg cgt agt atg cgg cat ctt gtc gtg ctg gtg gag gag ttg 240
Arg Leu Gly Arg Ser Met Arg His Leu Val Val Leu Val Glu Glu Leu
65 70 75 80

cgc gaa cga ggc atc aac ttt cgt agt ctg acg gat tca att gat acc 288
Arg Glu Arg Gly Ile Asn Phe Arg Ser Leu Thr Asp Ser Ile Asp Thr
85 90 95

agc aca cca atg gga cgc ttt ttc ttt cat gtg atg ggt gcc ctg gct 336
Ser Thr Pro Met Gly Arg Phe Phe Phe His Val Met Gly Ala Leu Ala
100 105 110

gaa atg gag cgt gaa ctg att gtt gaa cga aca aaa gct gga ctg gaa 384
Glu Met Glu Arg Glu Leu Ile Val Glu Arg Thr Lys Ala Gly Leu Glu
115 120 125

act gct cgt gca cag gga cga att ggt gga cgt cgt ccc aaa ctt aca 432
Thr Ala Arg Ala Gln Gly Arg Ile Gly Gly Arg Arg Pro Lys Leu Thr
130 135 140

cca gaa caa tgg gca caa gct gga cga tta att gca gca gga act cct 480
Pro Glu Gln Trp Ala Gln Ala Gly Arg Leu Ile Ala Ala Gly Thr Pro
145 150 155 160

cgc cag aag gtg gcg att atc tat gat gtt ggt gtg tca act ttg tat 528
Arg Gln Lys Val Ala Ile Ile Tyr Asp Val Gly Val Ser Thr Leu Tyr
165 170 175

aag agg ttt cct gca ggg gat aaa taa 555
Lys Arg Phe Pro Ala Gly Asp Lys *
180

<210> 69 
<211> 184 
<212> PRT 

<213> Escherichia coli
<400> 69

Met Leu Ile Gly Tyr Val Arg Val Ser Thr Asn Asp Gln Asn Thr Asp
1 5 10 15

Leu Gln Arg Asn Ala Leu Asn Cys Ala Gly Cys Glu Leu Ile Phe Glu
20 25 30

Asp Lys Ile Ser Gly Thr Lys Ser Glu Arg Pro Gly Leu Lys Lys Leu
35 40 45

Leu Arg Thr Leu Ser Ala Gly Asp Thr Leu Val Val Trp Lys Leu Asp
50 55 60

Arg Leu Gly Arg Ser Met Arg His Leu Val Val Leu Val Glu Glu Leu
65 70 75 80

Arg Glu Arg Gly Ile Asn Phe Arg Ser Leu Thr Asp Ser Ile Asp Thr
85 90 95

Ser Thr Pro Met Gly Arg Phe Phe Phe His Val Met Gly Ala Leu Ala
100 105 110

Glu Met Glu Arg Glu Leu Ile Val Glu Arg Thr Lys Ala Gly Leu Glu
115 120 125

Thr Ala Arg Ala Gln Gly Arg Ile Gly Gly Arg Arg Pro Lys Leu Thr
130 135 140

Pro Glu Gln Trp Ala Gln Ala Gly Arg Leu Ile Ala Ala Gly Thr Pro
145 150 155 160

Arg Gln Lys Val Ala Ile Ile Tyr Asp Val Gly Val Ser Thr Leu Tyr
165 170 175

Lys Arg Phe Pro Ala Gly Asp Lys
180

<210> 70 
<211> 4778 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCX plasmid



<400> 70 

gtcgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 60
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180
ggactttcca ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac 240
atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300
cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360
tattagtcat cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc 420
atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480
gcgatggggg cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540
gggcggggcg aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600
tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660
gggagtcgct gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720
ccggctctga ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780
gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840
ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900
tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960
cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020
ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140
cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200
gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260
ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440
tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560
acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620
gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680
acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcactcct caggtgcagg 1740
ctgcctatca gaaggtggtg gctggtgtgg ccaatgccct ggctcacaaa taccactgag 1800
atctttttcc ctctgccaaa aattatgggg acatcatgaa gccccttgag catctgactt 1860
ctggctaata aaggaaattt attttcattg caatagtgtg ttggaatttt ttgtgtctct 1920
cactcggaag gacatatggg agggcaaatc atttaaaaca tcagaatgag tatttggttt 1980
agagtttggc aacatatgcc atatgctggc tgccatgaac aaaggtggct ataaagaggt 2040
catcagtata tgaaacagcc ccctgctgtc cattccttat tccatagaaa agccttgact 2100
tgaggttaga ttttttttat attttgtttt gtgttatttt tttctttaac atccctaaaa 2160
ttttccttac atgttttact agccagattt ttcctcctct cctgactact cccagtcata 2220
gctgtccctc ttctcttatg aagatccctc gacctgcagc ccaagcttgg cgtaatcatg 2280
gtcatagctg tttcctgtgt gaaattgtta tccgctcaca attccacaca acatacgagc 2340
cggaagcata aagtgtaaag cctggggtgc ctaatgagtg agctaactca cattaattgc 2400
gttgcgctca ctgcccgctt tccagtcggg aaacctgtcg tgccagcgga tccgcatctc 2460
aattagtcag caaccatagt cccgccccta actccgccca tcccgcccct aactccgccc 2520
agttccgccc attctccgcc ccatggctga ctaatttttt ttatttatgc agaggccgag 2580
gccgcctcgg cctctgagct attccagaag tagtgaggag gcttttttgg aggcctaggc 2640
ttttgcaaaa agctaacttg tttattgcag cttataatgg ttacaaataa agcaatagca 2700
tcacaaattt cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac 2760
tcatcaatgt atcttatcat gtctggatcc gctgcattaa tgaatcggcc aacgcgcggg 2820
gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc 2880
ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac 2940
agaatcaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa 3000
ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca 3060
caaaaatcga cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc 3120
gtttccccct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata 3180
cctgtccgcc tttctccctt cgggaagcgt ggcgctttct caatgctcac gctgtaggta 3240
tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca 3300
gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 3360
cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 3420
tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg 3480
tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg 3540
caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 3600
aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa 3660
cgaaaactca cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat 3720
ccttttaaat taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc 3780
tgacagttac caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc 3840
atccatagtt gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc 3900
tggccccagt gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc 3960
aataaaccag ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc 4020
catccagtct attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt 4080
gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc 4140
ttcattcagc tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa 4200
aaaagcggtt agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt 4260
atcactcatg gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg 4320
cttttctgtg actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc 4380
gagttgctct tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa 4440
agtgctcatc attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt 4500
gagatccagt tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt 4560
caccagcgtt tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag 4620
ggcgacacgg aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta 4680
tcagggttat tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat 4740
aggggttccg cgcacatttc cccgaaaagt gccacctg 4778



<210> 71 
<211> 5510 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCXeGFP plasmid 



<400> 71 

gtcgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 60
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180
ggactttcca ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac 240
atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300
cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360
tattagtcat cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc 420
atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480
gcgatggggg cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540
gggcggggcg aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600
tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660
gggagtcgct gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720
ccggctctga ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780
gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840
ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900
tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960
cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020
ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140
cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200
gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260
ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440
tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560
acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620
gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680
acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcgccacc atggtgagca 1740
agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac ggcgacgtaa 1800
acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac ggcaagctga 1860
ccctgaagtt catctgcacc accggcaagc tgcccgtgcc ctggcccacc ctcgtgacca 1920
ccctgaccta cggcgtgcag tgcttcagcc gctaccccga ccacatgaag cagcacgact 1980
tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc ttcaaggacg 2040
acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg gtgaaccgca 2100
tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac aagctggagt 2160
acaactacaa cagccacaac gtctatatca tggccgacaa gcagaagaac ggcatcaagg 2220
tgaacttcaa gatccgccac aacatcgagg acggcagcgt gcagctcgcc gaccactacc 2280
agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac tacctgagca 2340
cccagtccgc cctgagcaaa gaccccaacg agaagcgcga tcacatggtc ctgctggagt 2400
tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaagtaa gaattcactc 2460
ctcaggtgca ggctgcctat cagaaggtgg tggctggtgt ggccaatgcc ctggctcaca 2520
aataccactg agatcttttt ccctctgcca aaaattatgg ggacatcatg aagccccttg 2580
agcatctgac ttctggctaa taaaggaaat ttattttcat tgcaatagtg tgttggaatt 2640
ttttgtgtct ctcactcgga aggacatatg ggagggcaaa tcatttaaaa catcagaatg 2700
agtatttggt ttagagtttg gcaacatatg ccatatgctg gctgccatga acaaaggtgg 2760
ctataaagag gtcatcagta tatgaaacag ccccctgctg tccattcctt attccataga 2820
aaagccttga cttgaggtta gatttttttt atattttgtt ttgtgttatt tttttcttta 2880
acatccctaa aattttcctt acatgtttta ctagccagat ttttcctcct ctcctgacta 2940
ctcccagtca tagctgtccc tcttctctta tgaagatccc tcgacctgca gcccaagctt 3000
ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca 3060
caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact 3120
cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagcg 3180
gatccgcatc tcaattagtc agcaaccata gtcccgcccc taactccgcc catcccgccc 3240
ctaactccgc ccagttccgc ccattctccg ccccatggct gactaatttt ttttatttat 3300
gcagaggccg aggccgcctc ggcctctgag ctattccaga agtagtgagg aggctttttt 3360
ggaggcctag gcttttgcaa aaagctaact tgtttattgc agcttataat ggttacaaat 3420
aaagcaatag catcacaaat ttcacaaata aagcattttt ttcactgcat tctagttgtg 3480
gtttgtccaa actcatcaat gtatcttatc atgtctggat ccgctgcatt aatgaatcgg 3540
ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 3600
ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 3660
acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 3720
aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 3780
tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 3840
aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 3900
gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcaatgctc 3960
acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 4020
accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 4080
ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 4140
gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 4200
gacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 4260
ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 4320
gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 4380
cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 4440
cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 4500
gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 4560
tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 4620
gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 4680
agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 4740
tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 4800
agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 4860
gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc 4920
catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 4980
ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc 5040
atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg 5100
tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 5160
cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 5220
cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 5280
atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 5340
aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 5400
ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 5460
aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg 5510

<210> 72 
<211> 282 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> attP

<400> 72

ccttgcgcta atgctctgtt acaggtcact aataccatct aagtagttga ttcatagtga 60
ctgcatatgt tgtgttttac agtattatgt agtctgtttt ttatgcaaaa tctaatttaa 120
tatattgata tttatatcat tttacgtttc tcgttcagct tttttatact aagttggcat 180
tataaaaaag cattgcttat caatttgttg caacgaacag gtcactatca gtcaaaataa 240
aatcattatt tgatttcaat tttgtcccac tccctgcctc tg 282
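
Each entry's declared length (`<211>`) can be compared with the number of bases actually present under `<400>`. The following is a minimal Python sketch, not part of the listing: the function name `count_bases` is an illustrative assumption, and the rows are the attP sequence of SEQ ID NO: 72 transcribed with its position numbers.

```python
# Sketch: count the bases in ST.25-style sequence rows and compare with <211>.
# The function name and constants are illustrative, not part of the listing.
import re


def count_bases(listing_rows: str) -> int:
    """Count nucleotide letters, ignoring spaces, line breaks and position numbers."""
    return len(re.findall(r"[acgtun]", listing_rows.lower()))


ATTP_ROWS = """
ccttgcgcta atgctctgtt acaggtcact aataccatct aagtagttga ttcatagtga 60
ctgcatatgt tgtgttttac agtattatgt agtctgtttt ttatgcaaaa tctaatttaa 120
tatattgata tttatatcat tttacgtttc tcgttcagct tttttatact aagttggcat 180
tataaaaaag cattgcttat caatttgttg caacgaacag gtcactatca gtcaaaataa 240
aatcattatt tgatttcaat tttgtcccac tccctgcctc tg 282
"""

DECLARED_LENGTH = 282  # value of <211> for SEQ ID NO: 72

if __name__ == "__main__":
    found = count_bases(ATTP_ROWS)
    print(found, "bases found;", DECLARED_LENGTH, "declared")
    print("lengths agree:", found == DECLARED_LENGTH)
```

A mismatch between the two numbers usually points to a dropped or duplicated block in the transcription rather than to an error in the declared length.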



<210> 73
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 73

ggccccgtaa tgcagaagaa 20

<210> 74
<211> 32
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 74

ggtttaaagt gcgctcctcc aagaacgtca tc 32

<210> 75
<211> 40
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 75

agatctagag ccgccgctac aggaacaggt ggtggcggcc 40

<210> 76
<211> 37
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer 5PacSV40
<400> 76

ctgttaatta actgtggaat gtgtgtcagt tagggtg 37

<210> 77
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer Antisense Zeo
<400> 77

tgaacagggt cacgtcgtcc 20



<210> 78
<211> 24
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer 5' HETS
<400> 78

gggccgaaac gatctcaacc tatt 24

<210> 79
<211> 19
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer 3' HRTS
<400> 79

cgcagcggcc ctcctactc 19

<210> 80
<211> 29
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer 5BSD
<400> 80

accatgaaaa catttaacat ttctcaaca 29

<210> 81
<211> 29
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer SV40polyA
<400> 81

tttatttgtg aaatttgtga tgctattgc 29

<210> 82
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer 3BSP
<400> 82

ttaatttcgg gtatatttga gtgga 25

<210> 83
<211> 32
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer EPO5XBA
<400> 83

tatctagaat gggggtgcac gaatgtcctg cc 32

<210> 84
<211> 32
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer EPO3BSI
<400> 84

tacgtacgtc atctgtcccc tgtcctgcag gc 32

<210> 85
<211> 27
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer GENEPO3BSI
<400> 85

cgtacgtcat ctgtcccctg tcctgca 27

<210> 86
<211> 28
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer GENEPO5XBA
<400> 86

tctagaatgg gggtgcacgg tgagtact 28

<210> 87
<211> 4862
<212> DNA

<213> Artificial Sequence
<220>

<223> pD2eGFP-1N plasmid from Clontech



<400> 87 

tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg 60
cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt 120
gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca 180
atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc 240
aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta 300
catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac 360
catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg 420
atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg 480
ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt 540
acggtgggag gtctatataa gcagagctgg tttagtgaac cgtcagatcc gctagcgcta 600
ccggactcag atctcgagct caagcttcga attctgcagt cgacggtacc gcgggcccgg 660
gatccaccgg tcgccaccat ggtgagcaag ggcgaggagc tgttcaccgg ggtggtgccc 720
atcctggtcg agctggacgg cgacgtaaac ggccacaagt tcagcgtgtc cggcgagggc 780
gagggcgatg ccacctacgg caagctgacc ctgaagttca tctgcaccac cggcaagctg 840
cccgtgccct ggcccaccct cgtgaccacc ctgacctacg gcgtgcagtg cttcagccgc 900
taccccgacc acatgaagca gcacgacttc ttcaagtccg ccatgcccga aggctacgtc 960
caggagcgca ccatcttctt caaggacgac ggcaactaca agacccgcgc cgaggtgaag 1020
ttcgagggcg acaccctggt gaaccgcatc gagctgaagg gcatcgactt caaggaggac 1080
ggcaacatcc tggggcacaa gctggagtac aactacaaca gccacaacgt ctatatcatg 1140
gccgacaagc agaagaacgg catcaaggtg aacttcaaga tccgccacaa catcgaggac 1200
ggcagcgtgc agctcgccga ccactaccag cagaacaccc ccatcggcga cggccccgtg 1260
ctgctgcccg acaaccacta cctgagcacc cagtccgccc tgagcaaaga ccccaacgag 1320
aagcgcgatc acatggtcct gctggagttc gtgaccgccg ccgggatcac tctcggcatg 1380
gacgagctgt acaagaagct tagccatggc ttcccgccgg aggtggagga gcaggatgat 1440
ggcacgctgc ccatgtcttg tgcccaggag agcgggatgg accgtcaccc tgcagcctgt 1500
gcttctgcta ggatcaatgt gtagatgcgc ggccgcgact ctagatcata atcagccata 1560
ccacatttgt agaggtttta cttgctttaa aaaacctccc acacctcccc ctgaacctga 1620
aacataaaat gaatgcaatt gttgttgtta acttgtttat tgcagcttat aatggttaca 1680
aataaagcaa tagcatcaca aatttcacaa ataaagcatt tttttcactg cattctagtt 1740
gtggtttgtc caaactcatc aatgtatctt aaggcgtaaa ttgtaagcgt taatattttg 1800
ttaaaattcg cgttaaattt ttgttaaatc agctcatttt ttaaccaata ggccgaaatc 1860
ggcaaaatcc cttataaatc aaaagaatag accgagatag ggttgagtgt tgttccagtt 1920
tggaacaaga gtccactatt aaagaacgtg gactccaacg tcaaagggcg aaaaaccgtc 1980
tatcagggcg atggcccact acgtgaacca tcaccctaat caagtttttt ggggtcgagg 2040
tgccgtaaag cactaaatcg gaaccctaaa gggagccccc gatttagagc ttgacgggga 2100
aagccggcga acgtggcgag aaaggaaggg aagaaagcga aaggagcggg cgctagggcg 2160
ctggcaagtg tagcggtcac gctgcgcgta accaccacac ccgccgcgct taatgcgccg 2220
ctacagggcg cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta 2280
tttttctaaa tacattcaaa tatgtatccg ctcatgagac aataaccctg ataaatgctt 2340
caataatatt gaaaaaggaa gagtcctgag gcggaaagaa ccagctgtgg aatgtgtgtc 2400
agttagggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa agcatgcatc 2460
tcaattagtc agcaaccagg tgtggaaagt ccccaggctc cccagcaggc agaagtatgc 2520
aaagcatgca tctcaattag tcagcaacca tagtcccgcc cctaactccg cccatcccgc 2580
ccctaactcc gcccagttcc gcccattctc cgccccatgg ctgactaatt ttttttattt 2640
atgcagaggc cgaggccgcc tcggcctctg agctattcca gaagtagtga ggaggctttt 2700
ttggaggcct aggcttttgc aaagatcgat caagagacag gatgaggatc gtttcgcatg 2760
attgaacaag atggattgca cgcaggttct ccggccgctt gggtggagag gctattcggc 2820
tatgactggg cacaacagac aatcggctgc tctgatgccg ccgtgttccg gctgtcagcg 2880
caggggcgcc cggttctttt tgtcaagacc gacctgtccg gtgccctgaa tgaactgcaa 2940
gacgaggcag cgcggctatc gtggctggcc acgacgggcg ttccttgcgc agctgtgctc 3000
gacgttgtca ctgaagcggg aagggactgg ctgctattgg gcgaagtgcc ggggcaggat 3060
ctcctgtcat ctcaccttgc tcctgccgag aaagtatcca tcatggctga tgcaatgcgg 3120
cggctgcata cgcttgatcc ggctacctgc ccattcgacc accaagcgaa acatcgcatc 3180
gagcgagcac gtactcggat ggaagccggt cttgtcgatc aggatgatct ggacgaagag 3240
catcaggggc tcgcgccagc cgaactgttc gccaggctca aggcgagcat gcccgacggc 3300
gaggatctcg tcgtgaccca tggcgatgcc tgcttgccga atatcatggt ggaaaatggc 3360
cgcttttctg gattcatcga ctgtggccgg ctgggtgtgg cggaccgcta tcaggacata 3420
gcgttggcta cccgtgatat tgctgaagag cttggcggcg aatgggctga ccgcttcctc 3480
gtgctttacg gtatcgccgc tcccgattcg cagcgcatcg ccttctatcg ccttcttgac 3540
gagttcttct gagcgggact ctggggttcg aaatgaccga ccaagcgacg cccaacctgc 3600
catcacgaga tttcgattcc accgccgcct tctatgaaag gttgggcttc ggaatcgttt 3660
tccgggacgc cggctggatg atcctccagc gcggggatct catgctggag ttcttcgccc 3720
accctagggg gaggctaact gaaacacgga aggagacaat accggaagga acccgcgcta 3780
tgacggcaat aaaaagacag aataaaacgc acggtgttgg gtcgtttgtt cataaacgcg 3840
gggttcggtc ccagggctgg cactctgtcg ataccccacc gagaccccat tggggccaat 3900
acgcccgcgt ttcttccttt tccccacccc accccccaag ttcgggtgaa ggcccagggc 3960
tcgcagccaa cgtcggggcg gcaggccctg ccatagcctc aggttactca tatatacttt 4020
agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata 4080
atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag 4140
aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa 4200
caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt 4260
ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc 4320
cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa 4380
tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa 4440
gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc 4500
ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa 4560
gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa 4620
caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg 4680
ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc 4740
tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg 4800
ctcacatgtt ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgccatgc 4860
at 4862



<210> 88 
<211> 5192 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pIRESpuro2 plasmid from Clontech 
<400> 88 

gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggac tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gcttggtacc 900
gagctcggat cgatatctgc ggcctagcta gcgcttaagg cctgttaacc ggtcgtacgt 960
ctccggattc gaattcggat ccgcggccgc atagataact gatccagtgt gctggaatta 1020
attcgctgtc tgcgagggcc agctgttggg gtgagtactc cctctcaaaa gcgggcatga 1080
cttctgcgct aagattgtca gtttccaaaa acgaggagga tttgatattc acctggcccg 1140
cggtgatgcc tttgagggtg gccgcgtcca tctggtcaga aaagacaatc tttttgttgt 1200
caagcttgag gtgtggcagg cttgagatct ggccatacac ttgagtgaca atgacatcca 1260
ctttgccttt ctctccacag gtgtccactc ccaggtccaa ctgcaggtcg agcatgcatc 1320
tagggcggcc aattccgccc ctctccctcc ccccccccta acgttactgg ccgaagccgc 1380
ttggaataag gccggtgtgc gtttgtctat atgtgatttt ccaccatatt gccgtctttt 1440
ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc taggggtctt 1500
tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc agttcctctg 1560
gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg gaacccccca 1620
cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc tgcaaaggcg 1680
gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa atggctctcc 1740
tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg tatgggatct 1800
gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaaa aaaacgtcta 1860
ggccccccga accacgggga cgtggttttc ctttgaaaaa cacgatgata agcttgccac 1920
aacccacaag gagacgacct tccatgaccg agtacaagcc cacggtgcgc ctcgccaccc 1980
gcgacgacgt cccccgggcc gtacgcaccc tcgccgccgc gttcgccgac taccccgcca 2040
cgcgccacac cgtcgacccg gaccgccaca tcgagcgggt caccgagctg caagaactct 2100
tcctcacgcg cgtcgggctc gacatcggca aggtgtgggt cgcggacgac ggcgccgcgg 2160
tggcggtctg gaccacgccg gagagcgtcg aagcgggggc ggtgttcgcc gagatcggcc 2220
cgcgcatggc cgagttgagc ggttcccggc tggccgcgca gcaacagatg gaaggcctcc 2280
tggcgccgca ccggcccaag gagcccgcgt ggttcctggc caccgtcggc gtctcgcccg 2340
accaccaggg caagggtctg ggcagcgccg tcgtgctccc cggagtggag gcggccgagc 2400
gcgccggggt gcccgccttc ctggagacct ccgcgccccg caacctcccc ttctacgagc 2460
ggctcggctt caccgtcacc gccgacgtcg agtgcccgaa ggaccgcgcg acctggtgca 2520
tgacccgcaa gcccggtgcc tgacgcccgc cccacgaccc gcagcgcccg accgaaagga 2580
gcgcacgacc ccatggctcc gaccgaagcc gacccgggcg gccccgccga ccccgcaccc 2640
gcccccgagg cccaccgact ctagagctcg ctgatcagcc tcgactgtgc cttctagttg 2700
ccagccatct gttgtttgcc cctcccccgt gccttccttg accctggaag gtgccactcc 2760
cactgtcctt tcctaataaa atgaggaaat tgcatcgcat tgtctgagta ggtgtcattc 2820
tattctgggg ggtggggtgg ggcaggacag caagggggag gattgggaag acaatagcag 2880
gcatgctggg gatgcggtgg gctctatggc ttctgaggcg gaaagaacca gctggggctc 2940
gagtgcattc tagttgtggt ttgtccaaac tcatcaatgt atcttatcat gtctgtatac 3000
cgtcgacctc tagctagagc ttggcgtaat catggtcata gctgtttcct gtgtgaaatt 3060
gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt aaagcctggg 3120
gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc gctttccagt 3180
cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg agaggcggtt 3240
tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc 3300
tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca gaatcagggg 3360
ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg 3420
ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac aaaaatcgac 3480
gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg tttccccctg 3540
gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac ctgtccgcct 3600
ttctcccttc gggaagcgtg gcgctttctc aatgctcacg ctgtaggtat ctcagttcgg 3660
tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct 3720
gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac ttatcgccac 3780
tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt gctacagagt 3840
tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt atctgcgctc 3900
tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca 3960
ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat 4020
ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac gaaaactcac 4080
gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc cttttaaatt 4140
aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct gacagttacc 4200
aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca tccatagttg 4260
cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct ggccccagtg 4320
ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca ataaaccagc 4380
cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc atccagtcta 4440
ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg cgcaacgttg 4500
ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct tcattcagct 4560
ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa aaagcggtta 4620
gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg 4680
ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc ttttctgtga 4740
ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg agttgctctt 4800
gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa gtgctcatca 4860
ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg agatccagtt 4920
cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc accagcgttt 4980
ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga 5040
aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat cagggttatt 5100
gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc 5160
gcacatttcc ccgaaaagtg ccacctgacg tc 5192



<210> 89 
<211> 11182 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAgl Plasmid 



<400> 89 

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60
atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120
agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180
gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240
agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300
ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360
ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420
acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480
ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540
acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600
agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660
tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720
tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780
ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840
gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900
gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960
cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020
ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080
gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140
tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200
aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260
aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320
ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380
ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440
cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500
atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560
accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620
gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680
gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740
ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800
cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860
aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920
gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980
agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040
ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100
atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040
tacgaattcg agccttgact agagggtcga cggtatacag acatgataag atacattgat 8100
gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 8160
gatgctattg ctttatttgt aaccattata agctgcaata aacaagttgg ggtgggcgaa 8220
gaactccagc atgagatccc cgcgctggag gatcatccag ccggcgtccc ggaaaacgat 8280
tccgaagccc aacctttcat agaaggcggc ggtggaatcg aaatctcgta gcacgtgtca 8340
gtcctgctcc tcggccacga agtgcacgca gttgccggcc gggtcgcgca gggcgaactc 8400
ccgcccccac ggctgctcgc cgatctcggt catggccggc ccggaggcgt cccggaagtt 8460
cgtggacacg acctccgacc actcggcgta cagctcgtcc aggccgcgca cccacaccca 8520
ggccagggtg ttgtccggca ccacctggtc ctggaccgcg ctgatgaaca gggtcacgtc 8580
gtcccggacc acaccggcga agtcgtcctc cacgaagtcc cgggagaacc cgagccggtc 8640
ggtccagaac tcgaccgctc cggcgacgtc gcgcgcggtg agcaccggaa cggcactggt 8700
caacttggcc atggatccag atttcgctca agttagtata aaaaagcagg cttcaatcct 8760
gcaggaattc gatcgacact ctcgtctact ccaagaatat caaagataca gtctcagaag 8820
accaaagggc tattgagact tttcaacaaa gggtaatatc gggaaacctc ctcggattcc 8880
attgcccagc tatctgtcac ttcatcaaaa ggacagtaga aaaggaaggt ggcacctaca 8940
aatgccatca ttgcgataaa ggaaaggcta tcgttcaaga tgcctctgcc gacagtggtc 9000
ccaaagatgg acccccaccc acgaggagca tcgtggaaaa agaagacgtt ccaaccacgt 9060
cttcaaagca agtggattga tgtgataaca tggtggagca cgacactctc gtctactcca 9120
agaatatcaa agatacagtc tcagaagacc aaagggctat tgagactttt caacaaaggg 9180
taatatcggg aaacctcctc ggattccatt gcccagctat ctgtcacttc atcaaaagga 9240
cagtagaaaa ggaaggtggc acctacaaat gccatcattg cgataaagga aaggctatcg 9300
ttcaagatgc ctctgccgac agtggtccca aagatggacc cccacccacg aggagcatcg 9360
tggaaaaaga agacgttcca accacgtctt caaagcaagt ggattgatgt gatatctcca 9420
ctgacgtaag ggatgacgca caatcccact atccttcgca agaccttcct ctatataagg 9480
aagttcattt catttggaga ggacacgctg aaatcaccag tctctctcta caaatctatc 9540
tctctcgagc tttcgcagat ccgggggggc aatgagatat gaaaaagcct gaactcaccg 9600
cgacgtctgt cgagaagttt ctgatcgaaa agttcgacag cgtctccgac ctgatgcagc 9660
tctcggaggg cgaagaatct cgtgctttca gcttcgatgt aggagggcgt ggatatgtcc 9720
tgcgggtaaa tagctgcgcc gatggtttct acaaagatcg ttatgtttat cggcactttg 9780
catcggccgc gctcccgatt ccggaagtgc ttgacattgg ggagtttagc gagagcctga 9840
cctattgcat ctcccgccgt gcacagggtg tcacgttgca agacctgcct gaaaccgaac 9900
tgcccgctgt tctacaaccg gtcgcggagg ctatggatgc gatcgctgcg gccgatctta 9960
gccagacgag cgggttcggc ccattcggac cgcaaggaat cggtcaatac actacatggc 10020
gtgatttcat atgcgcgatt gctgatcccc atgtgtatca ctggcaaact gtgatggacg 10080
acaccgtcag tgcgtccgtc gcgcaggctc tcgatgagct gatgctttgg gccgaggact 10140
gccccgaagt ccggcacctc gtgcacgcgg atttcggctc caacaatgtc ctgacggaca 10200
atggccgcat aacagcggtc attgactgga gcgaggcgat gttcggggat tcccaatacg 10260
aggtcgccaa catcttcttc tggaggccgt ggttggcttg tatggagcag cagacgcgct 10320
acttcgagcg gaggcatccg gagcttgcag gatcgccacg actccgggcg tatatgctcc 10380
gcattggtct tgaccaactc tatcagagct tggttgacgg caatttcgat gatgcagctt 10440
gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc gggcgtacac 10500
aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg tgtagaagta ctcgccgata 10560
gtggaaaccg acgccccagc actcgtccga gggcaaagaa atagagtaga tgccgaccgg 10620
atctgtcgat cgacaagctc gagtttctcc ataataatgt gtgagtagtt cccagataag 10680
ggaattaggg ttcctatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 10740
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa accaaaatcc 10800
agtactaaaa tccagatccc ccgaattaat tcggcgttaa ttcagatcaa gcttggcact 10860
ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc gttacccaac ttaatcgcct 10920
tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca ccgatcgccc 10980
ttcccaacag ttgcgcagcc tgaatggcga atgctagagc agcttgagct tggatcagat 11040
tgtcgtttcc cgccttcagt ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 11100
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 11160
tccgttcgtc catttgtatg tg 11182



<210> 90 
<211> 8428 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCambia3300 Plasmid



<400> 90 

catgccaacc 

atagtgcagt 

agtcctaagt 

gfcfcttagtcg 

agagcgccgc 

ccaaccaacg 

ccggcaccag 

acgttgtgac 

ttgccgagcg 

acaccaccac 

agcgttccct 

fcgaagtttgg 

tcgaccagga 

ccctgtaccg 

gtgccttccg 

gccaagagga 

cgaagagatc 

ctcaaccgtg 

gccggccagc 

tgagtaaaac 

aatacgcaag 

aagacgacca 

ttagtcgatt 

ccgctaaccg 

cggcgcgact 

atcaaggcag 

accgccgacc 

gcggcctttg 

gcgctggccg 

ccaggcactg 

cgcgaggtcc 

aagagaaaafc 

gcaaggctgc 

agttgccggc 

ttaccgagct 

atgagtagat 

accgacgccg 

tgggttgtct 

cggtcgcaaa 

gaagttgaag 

tgaatcgtgg 

cggtgcgccg 

gatgcfcctat 

tctgtcgaag 



acagggttcc 
cggcttctga 
tacgcgacag 
cataaagtag 
cgctggcctg 
ggccgaactg 
gcgcgaccgc 
agtgaccagg 
catccaggag 
gccggccggc 
aatcatcgac 
cccccgccct 
aggccgcacc 
cgcacttgag 
tgaggacgca 
acaagcatga 
gaggcggaga 
cggctgcatg 
ttggccgctg 
agcttgcgtc 
gggaacgcat 
tcgcaaccca 
ccgatcccca 
ttgtcggcat 
tcgtagtgat 
ccgacttcgt 
tggtggagct 
tcgtgtcgcg 
ggtacgagct 
ccgccgccgg 
aggcgctggc 
gagcaaaagc 
aacgttggcc 
ggaggatcac 
gctatctgaa 
gaattttagc 
tggaafcgccc 
gccggccctg 
ccatccggcc 
gccgcgcagg 
caagcggccg 
tcgattagga 
gacgtgggca 
cgtgaccgac 



cctcgggatc 
cgttcagtgc 
gcfcgccgccc 
aatacttgcg 
ctgggctatg 
cacgcggccg 
ccggagctgg 
ctagaccgcc 
gccggcgcgg 
cgcatggtgt 
cgcacccgga 
accctcaccc 
gtgaaagagg 
cgcagcgagg 
ttgaccgagg 
aaccgcacca 
tgatcgcggc 
aaatcctggc 
aagaaaccga 
afcgcggtcgc 
gaaggttatc 
tctagcccgc 
gggcagtgcc 
cgaccgcccg 
cgacggagcg 
gctgattccg 
ggttaagcag 
ggcgatcaaa 
gcccattctt 
cacaaccgtt 
cgctgaaatt 
acaaacacgc 
agcctggcag 
accaagctga 
tacatcgcgc 
ggctaaagga 
catgtgtgga 
caatggcact 
cggtacaaat 
ccgcccagcg 
ctgatcgaat 
agocgcccaa 
cccgcgatag 
gagctggcga 



aaagtacttt 
agccgtcttc 
tgcccttttc 
actagaaccg 
cccgcgtcag 
gctgcaccaa 
ccaggatgct 
tggcccgcag 
gcctgcgtag 
tgaccgtgtt 
gcgggcgcga 
cggcacagat 
cggctgcact 
aagtgacgcc 
ccgacgccct 
ggacggccag 
cgggtacgtg 
cggtttgtct 
gcgccgccgt 
tgcgtatatg 
gctgtactta 
gccctgcaac 
cgcgattggg 
acgattgacc 
ccccaggcgg 
gtgcagccaa 
cgcattgagg 
ggcacgcgca 
gagtcccgta 
cttgaatcag 
aaatcaaaac 
taagtgccgg 
acacgccagc 
agatgtacgc 
agctaccaga 
ggcggcatgg 
ggaacgggcg 
ggaaccccca 
cggcgcggcg 
gcaacgcatc 
ccgcaaagaa 

gggcgacgag 

tcgcagcatc 
ggtgatccgc 



gatccaaccc 
tgaaaacgac 
ctggcgtttt 
gagacattac 
caccgacgac 
gctgttttcc 
tgaccaccta 
cacccgcgac 
cctggcagag 
cgccggcatt 
ggccgccaag 
cgcgcacgcc 
gcttggcgtg 
caccgaggcc 
ggcggccgcc 
gacgaaccgt 
ttcgagccgc 
gatgccaagc 
ctaaaaaggt 
atgcgatgag 
accagaaagg 
tcgccggggc 
cggccgtgcg 
gcgacgtgaa 
cggacttggc 
gcccttacga 
tcacggatgg 
tcggcggtga 
tcacgcagcg 
aacc cgaggg 
tcatttgagt 
ccgtccgagc 
catgaagcgg 
ggtacgccaa 
gtaaatgagc 
aaaatcaaga 
gttggccagg 
agcccgagga 
ctgggtgatg 
gaggcagaag 
tcccggcaac 
caaccagatt 
atggacgtgg 
tacgagcttc 



ctccgctgct 
atgtcgcaca 
cttgtcgcgt 
gccatgaaca 
caggacttga 
gagaagatca 
cgccctggcg 
ctactggaca 
ccgtgggccg 
gccgagttcg 
gcccgaggcg 
cgcgagctga 
catcgctcga 
aggcggcgcg 
gagaatgaac 
ttttcattac 
ccgcgcacgt 
tggcggcctg 
gatgtgtatt 
taaataaaca 
cgggtcaggc 
cgatgttctg 
ggaagatcaa 
ggccatcggc 
tgtgtccgcg 
catatgggcc 
aaggctacaa 
ggttgccgag 
cgtgagctac 
cgacgctgcc 
taatgaggta 
gcacgcagca 
gtcaactttc 
ggcaagacca 
aaatgaataa 
acaaccaggc 
cgtaagcggc 
atcggcgtga 
acctggtgga 
cacgccccgg 
cgccggcagc 
ttttcgttcc 
ccgttttccg 
cagacgggca 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 
cgtagaggtt
gatggcggtt
gcccggccgc 
tggcggaaag 
tgccatgcag 
agccttgatt 
gatcgagcta 
gacggttcac 
ggcacgccgc 
cagtggcagc 
aaatgacctg 
catgcgctac 
gatgctaggg 
tagcacgtac 
cccaaagccg 
aggcgatttt 
ctgtgcataa 
gtcgctgcgc
aaaaatggct 
actcgaccgc 
aaaacctctg 
ggagcagaca 
tgacccagtc 
gattgtactg 
ataccgcatc 
gctgcggcga 
ggataacgca 
ggccgcgttg 
acgctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
actggcagca 
gttcttgaag 
tctgctgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
atattttatt 
ctgttcttcc 
gtccgccctg 
gatgttgctg 
ctttaaaaaa 
gcaatccaca 
taagctattc 
cgcatacagc 
gacgccatcg 
gacctttgga 
atcataggtg 
tcccaccagc 
tttttcgatc 
tcctcttttc 
aattcactgt 
ttttcaaagt 
caggcagcaa 
gtttcaaacc 
tctgccgcct 
cgagtggtga 
tatattgtgg 
taatgtactg 
gttttaggaa 
ggtttcttat 
ggaactactc 
ggacggggcg 
ccgtgcttga 
atgcgcacgc 



tccgcagggc 
tcccatctaa 
gtgttccgtc 
cagaaagacg 
cgtacgaaga 
agccgctaca 
gctgattgga 
cccgattact 
gccgcaggca 
gccggagagt 
ccggagtacg 
cgcaacctga 
caaattgccc 
attgggaacc 
tacattggga 
tccgcctaaa 
ctgtctggcc 
tccctacgcc 
ggcctacggc 
cggcgcccac 
acacatgcag 
agcccgtcag 
acgtagcgat 
agagtgcacc 
aggcgctctt 
gcggtatcag 
ggaaagaaca 
ctggcgtttt 
cagaggtggc 
ctcgtgcgct 
tcgggaagcg 
gttcgctcca 
tccggtaact 
gccactggta 
tggtggccta 
ccagttacct 
agcggtggtt 
gatcctttga 
attttggtca 
ttctcccaat 
ccgatatcct 
ccgcttctcc 
tctcccaggt 
tcatacagct 
tcggccagat 
gtatagggac 
tcgataatct 
gcctcactca 
acaggcagct 
gtccctttat 
ttatatacct 
agttttttca 
tacagtattt 
tccttgcatt 
tggcgtataa 
cgctctgtca 
cggcagctta 
tacaacggct 
ttttgtgccg 
tgtaaacaaa 
aattaacgcc 
ttagaaattt 
atgctcaaca 
acacattatt 
gtaccggcag 
agccggccgc 
tcgggtcgtt 



cggccggcat 
ccgaatccat 
cacacgttgc 
acctggtaga 
aggccaagaa 
agatcgtaaa 
tgtaccgcga 
ttttgatcga 
aggcagaagc 
tcaagaagtt 
atttgaagga 
tcgagggcga 
tagcagggga
caaagccgta 
accggtcaca 
actctttaaa 
agcgcacagc 
ccgccgcttc 
caggcaatct 
atcaaggcac 
ctcccggaga 
ggcgcgtcag 
agcggagtgt 
atatgcggtg 
ccgcttcctc 
ctcactcaaa 
tgtgagcaaa 
tccataggct 
gaaacccgac 
ctcctgttcc 
tggcgctttc 
agctgggctg 
atcgtcttga 
acaggattag 
actacggcta 
tcggaaaaag 
tttttgtttg 
tcttttctac
tgcattctag 
caggcttgat 
ccctgatcga 
caagatcaat 
cgccgtggga 
cgcgcggatc 
cgttattcag 
aatccgatat 
tttcagggct 
tgagcagatt 
ttccttccag 
accggctgtc 
tagcaggaga 
attccggtga 
aaagataccc 
ctaaaacctt 
catagtatcg 
tcgttacaat 
gttgccgttc 
ctcccgctga
agctgccggt 
ttgacgctta 
gaattaattc 
tattgataga 
catgagcgaa 
atggagaaac 
gctgaagtcc
ccgcagcatg 
gggcagcccg 



ggccagtgtg 
gaaccgatac 
ggacgtactc 
aacctgcatt 
cggccgcctg 
gagcgaaacc 
gatcacagaa 
tcccggcatc 
cagatggttg 
ctgtttcacc 
ggaggcgggg 
agcatccgcc 
aaaaggtcga 
cattgggaac 
catgtaagtg 
acttattaaa 
cgaagagctg
gcgtcggcct
accagggcgc 
cctgcctcgc 
cggtcacagc 
cgggtgttgg 
atactggctt 
tgaaataccg 
gctcactgac 
ggcggtaata 
aggccagcaa 
ccgcccccct 
aggactataa 
gaccctgccg 
tcatagctca
tgtgcacgaa
gtccaacccg
cagagcgagg
cactagaagg
agttggtagc
caagcagcag

ggggtctgac 

gtactaaaac 
ccccagtaag 
ccggacgcag 
aaagccactt
aaagacaagt 
tttaaatgga 
taagtaatcc 
gtcgatggag 
ttgttcatct 
gctccagcca 
ccatagcatc 
cgtcattttt 
cattccttcc 
tattctcatt 
caagaagcta 
aaataccaga 
acggagccga
caacatgcta
ttccgaatag 
cgccgtcccg 
cggggagctg
gacaacttaa 

gggggatctg 

agtattttac 
accctatagg 
tcgagtcaaa
agctgccaga
ccgcgggggg
atgacagcga



tgggattacg 

cgggaaggga
aagttctgcc
cggttaaaca
gtgacggtat
gggcggccgg
ggcaagaacc 
ggccgttttc 
ttcaagacga 
gtgcgcaagc
caggctggcc 
ggttcctaat 
aaaggtctct 
cggaacccgt 
actgatataa 
actcttaaaa 
caaaaagcgc
atcgcggccg 
ggacaagccg 
gcgtttcggt 
ttgtctgtaa 
cgggtgtcgg 
aactatgcgg
cacagatgcg
tcgctgcgct
cggttatcca
aaggccagga 
gacgagcatc 
agataccagg 
cttaccggat
cgctgtaggt 
ccccccgttc 
gtaagacacg 
tatgtaggcg 
acagtatttg 
tcttgatccg 
attacgcgca
gctcagtgga 
aattcatcca 
tcaaaaaata 
aaggcaatgt 
actttgccat
tcctcttcgg 
gtgtcttctt 
aattcggcta
tgaaagagcc
tcatactctt
tcatgccgtt
atgtcctttt 
aaatataggt 
gtatctttta 
ttagccattt 
attataacaa 
aaacagcttt 
ttttgaaacc 
ccctccgcga 
catcggtaac
gactgatggg 
ttggctggct 
taacacattg 
gattttagta 
aaatacaaat 
aaccctaatt 
tctcggtgac
aacccacgtc 
catatccgag 
ccacgctctt 



acctggtact 
agggagacaa 
ggcgagccga
ccacgcacgt 
ccgagggtga 
agtacatcga 
cggacgtgct
tctaccgcct 
tctacgaacg 
tgatcgggtc 
cgatcctagt 
gtacggagca
ttcctgtgga 
acattgggaa 
aagagaaaaa 
cccgcctggc 
ctacccttcg 
ctggccgctc 
cgccgtcgcc 
gatgacggtg
gcggatgccg 
ggcgcagcca 
catcagagca 
taaggagaaa
cggtcgttcg
cagaatcagg
accgtaaaaa
acaaaaatcg 
cgtttccccc 
acctgtccgc 
atctcagttc 
agcccgaccg 
acttatcgcc 
gtgctacaga
gtatctgcgc
gcaaacaaac 
gaaaaaaagg 
acgaaaactc 
gtaaaatata 
gctcgacata 
cataccactt 
ctttcacaaa 
gcttttccgt
cccagttttc 
agcggctgtc 
tgatgcactc
ccgagcaaag 
caaagtgcag 
cccgttccac 
tttcattttc 
cgcagcggta 
attatttcct 
gacgaactcc 
ttcaaagttg 
gcggtgatca 
gatcatccgt 
atgagcaaag 
ctgcctgtat 
ggtggcagga 
cggacgtttt 
ctggattttg 
acatactaag 
cccttatctg 
gggcaggacc 
atgccagttc 
cgcctcgtgc 
gaagccctgt 



2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
gcctccaggg
cggggggaga 
gggcccgcgt 
cgctcccgca 
aagttgaccg 
gcctcggtgg 
gagatagatt 
ttccttatat
agtggagata 
cacgatgctc 
aacgatagcc 
tgtccttttg 
taccctttgt 
cttggagtag 
agacgtggtt 
gggaccactg 
tttgtaggtg
atggaatccg 
gtcttctgag 
gttggcaagc 
taatgcagct 
aatgtgagtt 
atgttgtgtg 
tacgaattcg 
ggcactggcc 
tcgccttgca 
tcgcccttcc 
tcagattgtc 
ggtaaaccta 
ggtttatccg 



acttcagcag 
cgtacacggt 
aggcgatgcc 
gacggacgag 
tgcttgtctc 
cacggcggat 
tgtagagaga 
agaggaaggt 
tcacatcaat 
ctcgtgggtg 
tttcctttat 
atgaagtgac 
tgaaaagtct 
acgagagtgt 
ggaacgtctt 
tcggcagagg 
ccaccttcct 
aggaggtttc 
actgtatctt 
tgctctagcc 
ggcacgacag 
agctcactca 
gaattgtgag 
agctcggtac 
gtcgttttac 
gcacatcccc 
caacagttgc 
gtttcccgcc 
agagaaaaga 
ttcgtccatt 



gtgggtgtag 
cgactcggcc 
ggcgacctcg 
gtcgtccgtc 
gatgtagtgg 
gtcggccggg 
gactggtgat 
cttgcgaagg 
ccacttgctt 
ggggtccatc 
cgcaatgatg 
agatagctgg 
caatagccct 
cgtgctccac 
ctttttccac 
catcttgaac 
tttctactgt 
ccgatattac 
tgatattctt 
aatacgcaaa 
gtttcccgac 
ttaggcaccc 
cggataacaa 
ccggggatcc 
aacgtcgtga 
ctttcgccag 
gcagcctgaa 
ttcagtttaa 
gcgtttatta 
tgtatgtg 



agcgtggagc 
gtccagtcgt 
ccgtccacct 
cactcctgcg 
ttgacgatgg 
cgtcgttctg 
ttcagcgtgt 
atagtgggat 
tgaagacgtg 
tttgggacca 
gcatttgtag 
gcaatggaat 
ttggtcttct 
catgttatca 
gatgctcctc 
gatagccttt 
ccttttgatg 
cctttgttga 
ggagtagacg 
ccgcctctcc 
tggaaagcgg 
caggctttac 
tttcacacag 
tctagagtcg 
ctgggaaaac 
ctggcgtaat 
tggcgaatgc 
actatcagtg 
gaataacgga 



ccagtcccgt 
aggcgttgcg 
cggcgacgag 
gttcctgcgg 
tgcagaccgc 
ggctcatggt 
cctctccaaa 
tgtgcgtcat 
gttggaacgt 
ctgtcggcag 
gtgccacctt 
ccgaggaggt
gagactgtat
catcaatcca 
gtgggtgggg 
cctttatcgc 
aagtgacaga 
aaagtctcaa 
agagtgtcgt 
ccgcgcgttg 
gcagtgagcg 
actttatgct 
gaaacagcta 
acctgcaggc 
cctggcgtta 
agcgaagagg 
tagagcagct 
tttgacagga 
tatttaaaag



ccgctggtgg 
tgccttccag 
ccagggatag 
ctcggtacgg 
cggcatgtcc 
agactcgaga 
tgaaatgaac 
cccttacgtc 
cttctttttc 
aggcatcttg 
ccttttctac 
ttcccgatat 
ctttgatatt 
cttgctttga 
gtccatcttt 
aatgatggca 
tagctgggca 
tagccctttg 
gctccaccat 
gccgattcat 
caacgcaatt 
tccggctcgt 
tgaccatgat 
atgcaagctt 
cccaacttaa 
cccgcaccga 
tgagcttgga 
tatattggcg 
ggcgtgaaaa 



6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8428 



<210> 91 
<211> 3438 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> pLIT38attBZeo Plasmid 



<400> 91 

tcgaccctct 

gtcgtgactg 

tcgccagctg 

gcctgaatgg 

gttaactacg 

tttctaaata 

ataatattga 

ttttgcggca

tgctgaagat

gatccttgag 

gctatgtggc 

acactattct 

tggcatgaca 

caacttactt 

gggggatcat

cgacgagcgt 

tggcgaacta 

agttgcagga 

tggagccggt 

ctcccgtatc 

acagatcgct 

ctcatatata 

aagattgtat 

aatttttgtt 

aaatcaaaag 

ctattaaaga 

ccactacgtg 



agtcaaggcc 
ggaaaaccct 
gcgtaatagc
cgaatggcgc
tcaggtggca 
cattcaaata 
aaaaggaaga 
ttttgccttc
cagttgggtg 
agttttcgcc 
gcggtattat
cagaatgact 
gtaagagaat 
ctgacaacga 
gtaactcgcc 
gacaccacga 
cttactctag 
ccacttctgc 
gagcgtgggt 
gtagttatct 
gagataggtg 
ctttagattg 
aagcaaatat 
aaatcagctc 
aatagcccga
acgtggactc 
aaccatcacc 



ttaagtgagt 
ggcgttaccc 
gaagaggccc
ttcgcttggt
cttttcgggg
tgtatccgct
gtatgagtat 
ctgtttttgc 
cacgagtggg 
ccgaagaacg 
cccgtgttga 
tggttgagta 
tatgcagtgc 

tcggaggacc

ttgatcgttg 
tgcctgtagc 
cttcccggca 
gctcggccct 
ctcgcggtat
acacgacggg
cctcactgat 
atttaccccg 
ttaaattgta 
attttttaac 
gatagggttg 
caacgtcaaa
caaatcaagt 



cgtattacgg
aacttaatcg 
gcaccgatcg 
aataaagccc
aaatgtgcgc 
catgagacaa 
tcaacatttc 
tcacccagaa 
ttacatcgaa 
ttctccaatg 
cgccgggcaa 
ctcaccagtc 
tgccataacc 
gaaggagcta 
ggaaccggag
aatggcaaca 
acaattaata 
tccggctggc 
cattgcagca
gagtcaggca 
taagcattgg 
gttgataatc 
aacgttaata 
caataggccg
agtgttgttc 
gggcgaaaaa 
tttttggggt 



actggccgtc 
ccttgcagca 
cccttcccaa 
gcttcggcgg
ggaaccccta 
taaccctgat 
cgtgtcgccc 
acgctggtga 
ctggatctca 
atgagcactt 
gagcaactcg 
acagaaaagc 
atgagtgata 
accgcttttt
ctgaatgaag
acgttgcgca
gactggatgg 
tggtttattg 
ctggggccag 
actatggatg 
taactgtcag 
agaaaagccc
ttttgttaaa 
aaatcggcaa 
cagtttggaa 
ccgtctatca 
cgaggtgccg 



gttttacaac 
catccccctt 
cagttgcgca
gctttttttt 
tttgtttatt 
aaatgcttca
ttattccctt 
aagtaaaaga 
acagcggtaa
ttaaagttct 
gtcgccgcat 
atcttacgga
acactgcggc 
tgcacaacat 
ccataccaaa 
aactattaac 
aggcggataa
ctgataaatc 
atggtaagcc 
aacgaaatag 
accaagttta 
caaaaacagg 
attcgcgtta
aatcccttat 
caagagtcca 

gggcgatggc 

taaagcacta 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 
aatcggaacc
gaaaggaagg 
cgctgcgcgt
atctaggtga 
ttccactgag 
ctgcgcgtaa 
ccggatcaag 
ccaaatactg 
ccgcctacat 
tcgtgtctta 
tgaacggggg 
tacctacagc 
tatccggtaa 
gcctggtatc 
tgatgctcgt 
ttcctggcct 
accccaggct 
acaatttcac 
ctagtggggc 
tgctttttta 
ccggtgctca 
ttctcccggg 
ttcatcagcg 
cgcggcctgg 
gcctccgggc 
cgcgacccgg 
cgagatttcg 
gacgccggct 
aacttgttta 
aataaagcat 
tatcatgtct 



ctaaagggag 
gaagaaagcg 
aaccaccaca 
agatcctttt 
cgtcagaccc 
tctgctgctt 
agctaccaac 
ttcttctagt 
acctcgctct 
ccgggttgga 
gttcgtgcac 
gtgagctatg 
gcggcagggt 
tttatagtcc 
caggggggcg 
tttgctggcc 
ttacacttta 
acaggaaaca 
ccgtgcaatt 
tactaacttg 
ccgcgcgcga 
acttcgtgga 
cggtccagga 
acgagctgta 
cggccatgac 
ccggcaactg 
attccaccgc 
ggatgatcct 
ttgcagctta 
ttttttcact 
gtataccg 



cccccgattt
aaaggagcgg 
cccgccgcgc
tgataatctc 
cgtagaaaag
gcaaacaaaa 
tctttttccg 
gtagccgtag 
gctaatcctg 
ctcaagacga 
acagcccagc 
agaaagcgcc 
cggaacagga 
tgtcgggttt 
gagcctatgg 
ttttgctcac 
tgcttccggc 
gctatgacca 
gaagccggct 
agcgaaatct 
cgtcgccgga 
ggacgacttc 
ccaggtggtg 
cgccgagtgg 
cgagatcggc 
cgtgcacttc 
cgccttctat 
ccagcgcggg 
taatggttac 
gcattctagt 
agagcttgac
gcgctagggc 
ttaatgcgcc 
atgaccaaaa 
atcaaaggat 
aaaccaccgc 
aaggtaactg 
ttaggccacc 
ttaccagtgg 
tagttaccgg
ttggagcgaa 
acgcttcccg 
gagcgcacga 
cgccacctct 
aaaaacgcca 
atgtaatgtg 
tcgtatgttg 
tgattacgcc 
ggcgccaagc 
ggatccatgg 
gcggtcgagt 
gccggtgtgg 
ccggacaaca 
tcggaggtcg 
gagcagccgt 
gtggccgagg 
gaaaggttgg 
gatctcatgc 
aaataaagca 
tgtggtttgt 



ggggaaagcg 
gctggcaagt 
gctacagggc 
tcccttaacg 
cttcttgaga 
taccagcggt
gcttcagcag
acttcaagaa
ctgctgccag 
ataaggcgca 
cgacctacac 
aagggagaaa 
gggagcttcc 
gacttgagcg 
gcaacgcggc 
agttagctca 
tgtggaattg 
aagctacgta 
ttctctgcag 
ccaagttgac 
tctggaccga 
tccgggacga
ccctggcctg 
tgtccacgaa 
gggggcggga 
agcaggactg 
gcttcggaat 
tggagttctt 
atagcatcac 
ccaaactcat 



aacgtggcga 
gtagcggtca 
gcgtaaaagg 
tgagttttcg
tccttttttt 
ggtttgtttg 
agcgcagata
ctctgtagca 
tggcgataag 
gcggtcgggc 
cgaactgaga
ggcggacagg 
agggggaaac 
tcgatttttg 
ctttttacgg
ctcattaggc 
tgagcggata 
atacgactca 
gattgaagcc 
cagtgccgtt 
ccggctcggg 
cgtgaccctg 
ggtgtgggtg 
cttccgggac 
gttcgccctg 
acacgtgcta 
cgttttccgg 
cgcccacccc 
aaatttcaca 
caatgtatct 



1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3438 



<210> 92 
<211> 10549 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCambia1302 Plasmid
<300> 

<308> Genbank #AF234398 
<309> 2000-04-24 



<400> 92 

catggtagat 

tgaattagat 

tgcaacatac 

gtggccaaca 

tcatatgaag 

gaccatcttc 

agacaccctc 

cctcggccac 

gcaaaagaac 

gcaactcgct 

agacaaccat 

ccacatggtc 

atacaaagct 

ccgatcgttc 

cgatgattat 

gcatgacgtt 

acgcgataga 

ctatgttact 

cctaagagaa 

tccgttcgtc 

ttgatccaac 

tctgaaaacg 



ctgactagta
ggtgatgtta 
ggaaaactta 
cttgtcacta 
cggcacgact 
ttcaaggacg 
gtcaacagga 
aagttggaat 
ggcatcaaag 
gatcattatc 
tacctgtcca
cttcttgagt 
agccaccacc 
aaacatttgg 
catataattt
atttatgaga 
aaacaaaata 
agatcgggaa 
aagagcgttt 
catttgtatg 
ccctccgctg 
acatgtcgca 



aaggagaaga 
atgggcacaa 
cccttaaatt 
ctttctctta 
tcttcaagag 
acgggaacta
tcgagcttaa 
acaactacaa 
ccaacttcaa 
aacaaaatac 
cacaatctgc 
ttgtaacagc 
accaccacca 
caataaagtt 
ctgttgaatt 
tgggttttta 
tagcgcgcaa 
ttaaactatc 
attagaataa 
tgcatgccaa 
ctatagtgca 
caagtcctaa 



acttttcact 
attttctgtc 
tatttgcact 
tggtgttcaa 
cgccatgcct 
caagacacgt 
gggaatcgat 
ctcccacaac 
gacccgccac 
tccaattggc 
cctttcgaaa 
tgctgggatt 
cgtgtgaatt 
tcttaagatt 
acgttaagca 
tgattagagt 
actaggataa 
agtgtttgac 
cggatattta 
ccacagggtt 
gtcggcttct 
gttacgcgac 



ggagttgtcc 
agtggagagg 
actggaaaac 
tgcttttcaa 
gagggatacg 
gctgaagtca 
ttcaaggagg
gtatacatca 
aacatcgaag 
gatggccctg 
gatcccaacg 
acacatggca 
ggtgaccagc 
gaatcctgtt 
tgtaataatt 
cccgcaatta 
attatcgcgc
aggatatatt 
aaagggcgtg 
cccctcggga 
gacgttcagt 
aggctgccgc 



caattcttgt 
gtgaaggtga 
tacctgttcc 
gatacccaga 
tgcaggagag 
agtttgaggg
acggaaacat 
tggccgacaa 
acggcggcgt 
tccttttacc 
aaaagagaga 
tggatgaact 
tcgaatttcc 
gccggtcttg 
aacatgtaat 
tacatttaat 
gcggtgtcat 
ggcgggtaaa 
aaaaggttta 
tcaaagtact 
gcagccgtct 
cctgcccttt 



60 

120 

180 

240 

300 

360 

420 

480 

540

600 

660 

720

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 
tcctggcgtt ttcttgtcgc gtgttttagt
cggagacatt acgccatgaa caagagcgcc 
agcaccgacg accaggactt gaccaaccaa 
aagctgtttt ccgagaagat caccggcacc 
cttgaccacc tacgccctgg cgacgttgtg 
agcacccgcg acctactgga cattgccgag 
agcctggcag agccgtgggc cgacaccacc 
ttcgccggca ttgccgagtt cgagcgttcc 
gaggccgcca aggcccgagg cgtgaagttt 
atcgcgcacg cccgcgagct gatcgaccag 
ctgcttggcg tgcatcgctc gaccctgtac 
cccaccgagg ccaggcggcg cggtgccttc 
ctggcggccg ccgagaatga acgccaagag 
aggacgaacc gtttttcatt accgaagaga 
tgttcgagcc gcccgcgcac gtctcaaccg 
ctgatgccaa gctggcggcc tggccggcca 
gtctaaaaag gtgatgtgta tttgagtaaa 
tgatgcgatg agtaaataaa caaatacgca 
taaccagaaa ggcgggtcag gcaagacgac 
actcgccggg gccgatgttc tgttagtcga 
ggcggccgtg cgggaagatc aaccgctaac 
ccgcgacgtg aaggccatcg gccggcgcga 
ggcggacttg gctgtgtccg cgatcaaggc 
aagcccttac gacatatggg ccaccgccga 
ggtcacggat ggaaggctac aagcggcctt 
catcggcggt gaggttgccg aggcgctggc 
tatcacgcag cgcgtgagct acccaggcac 
agaacccgag ggcgacgctg cccgcgaggt 
actcatttga gttaatgagg taaagagaaa 
ggccgtccga gcgcacgcag cagcaaggct 
gccatgaagc gggtcaactt tcagttgccg
gcggtacgcc aaggcaagac cattaccgag 
gagtaaatga gcaaatgaat aaatgagtag 
ggaaaatcaa gaacaaccag gcaccgacgc 
cggttggcca ggcgtaagcg gctgggttgt 
caagcccgag gaatcggcgt gacggtcgca 
cgctgggtga tgacctggtg gagaagttga 
tcgaggcaga agcacgcccc ggtgaatcgt 
aatcccggca accgccggca gccggtgcgc 
agcaaccaga ttttttcgtt ccgatgctct 
tcatggacgt ggccgttttc cgtctgtcga 
gctacgagct tccagacggg cacgtagagg 
tgtgggatta cgacctggta ctgatggcgg 
accgggaagg gaagggagac aagcccggcc 
tcaagttctg ccggcgagcc gatggcggaa 
ttcggttaaa caccacgcac gttgccatgc 
tggtgacggt atccgagggt gaagccttga 
ccgggcggcc ggagtacatc gagatcgagc 
aaggcaagaa cccggacgtg ctgacggttc 
tcggccgttt tctctaccgc ctggcacgcc 
tgttcaagac gatctacgaa cgcagtggca 
ccgtgcgcaa gctgatcggg tcaaatgacc 
ggcaggctgg cccgatccta gtcatgcgct 
ccggttccta atgtacggag cagatgctag 
gaaaaggtct ctttcctgtg gatagcacgt 
accggaaccc gtacattggg aacccaaagc 
tgactgatat aaaagagaaa aaaggcgatt 
aaactcttaa aacccgcctg gcctgtgcat 
tgcaaaaagc gcctaccctt cggtcgctgc 
ctatcgcggc cgctggccgc tcaaaaatgg 
gcggacaagc cgcgccgtcg ccactcgacc 
gcgcgtttcg gtgatgacgg tgaaaacctc 
gcttgtctgt aagcggatgc cgggagcaga 
ggcgggtgtc ggggcgcagc catgacccag
ttaactatgc ggcatcagag cagattgtac 
cgcacagatg cgtaaggaga aaataccgca 
actcgctgcg ctcggtcgtt cggctgcggc 
cgcataaagt agaatacttg cgactagaac 1380
gccgctggcc tgctgggcta tgcccgcgtc 1440
cgggccgaac tgcacgcggc cggctgcacc 1500
aggcgcgacc gcccggagct ggccaggatg 1560
acagtgacca ggctagaccg cctggcccgc 1620
cgcatccagg aggccggcgc gggcctgcgt 1680
acgccggccg gccgcatggt gttgaccgtg 1740
ctaatcatcg accgcacccg gagcgggcgc 1800
ggcccccgcc ctaccctcac cccggcacag 1860
gaaggccgca ccgtgaaaga ggcggctgca 1920
cgcgcacttg agcgcagcga ggaagtgacg 1980
cgtgaggacg cattgaccga ggccgacgcc 2040
gaacaagcat gaaaccgcac caggacggcc 2100
tcgaggcgga gatgatcgcg gccgggtacg 2160
tgcggctgca tgaaatcctg gccggtttgt 2220
gcttggccgc tgaagaaacc gagcgccgcc 2280
acagcttgcg tcatgcggtc gctgcgtata 2340
aggggaacgc atgaaggtta tcgctgtact 2400
catcgcaacc catctagccc gcgccctgca 2460
ttccgatccc cagggcagtg cccgcgattg 2520
cgttgtcggc atcgaccgcc cgacgattga 2580
cttcgtagtg atcgacggag cgccccaggc 2640
agccgacttc gtgctgattc cggtgcagcc 2700
cctggtggag ctggttaagc agcgcattga 2760
tgtcgtgtcg cgggcgatca aaggcacgcg 2820
cgggtacgag ctgcccattc ttgagtcccg 2880
tgccgccgcc ggcacaaccg ttcttgaatc 2940
ccaggcgctg gccgctgaaa ttaaatcaaa 3000
atgagcaaaa gcacaaacac gctaagtgcc 3060
gcaacgttgg ccagcctggc agacacgcca 3120
gcggaggatc acaccaagct gaagatgtac 3180
ctgctatctg aatacatcgc gcagctacca 3240
atgaatttta gcggctaaag gaggcggcat 3300
cgtggaatgc cccatgtgtg gaggaacggg 3360
ctgccggccc tgcaatggca ctggaacccc 3420
aaccatccgg cccggtacaa atcggcgcgg 3480
aggccgcgca ggccgcccag cggcaacgca 3540
ggcaagcggc cgctgatcga atccgcaaag 3600
cgtcgattag gaagccgccc aagggcgacg 3660
atgacgtggg cacccgcgat agtcgcagca 3720
agcgtgaccg acgagctggc gaggtgatcc 3780
tttccgcagg gccggccggc atggccagtg 3840
tttcccatct aaccgaatcc atgaaccgat 3900
gcgtgttccg tccacacgtt gcggacgtac 3960
agcagaaaga cgacctggta gaaacctgca 4020
agcgtacgaa gaaggccaag aacggccgcc 4080
ttagccgcta caagatcgta aagagcgaaa 4140
tagctgattg gatgtaccgc gagatcacag 4200
accccgatta ctttttgatc gatcccggca 4260
gcgccgcagg caaggcagaa gccagatggt 4320
gcgccggaga gttcaagaag ttctgtttca 4380
tgccggagta cgatttgaag gaggaggcgg 4440
accgcaacct gatcgagggc gaagcatccg 4500
ggcaaattgc cctagcaggg gaaaaaggtc 4560
acattgggaa cccaaagccg tacattggga 4620
cgtacattgg gaaccggtca cacatgtaag 4680
tttccgccta aaactcttta aaacttatta 4740
aactgtctgg ccagcgcaca gccgaagagc 4800
gctccctacg ccccgccgct tcgcgtcggc 4860
ctggcctacg gccaggcaat ctaccagggc 4920
gccggcgccc acatcaaggc accctgcctc 4980
tgacacatgc agctcccgga gacggtcaca 5040
caagcccgtc agggcgcgtc agcgggtgtt 5100
tcacgtagcg atagcggagt gtatactggc 5160
tgagagtgca ccatatgcgg tgtgaaatac 5220
tcaggcgctc ttccgcttcc tcgctcactg 5280
gagcggtatc agctcactca aaggcggtaa 5340
tacggttatc
aaaaggccag 
ctgacgagca 
aaagatacca 
cgcttaccgg 
cacgctgtag 
aaccccccgt 
cggtaagaca 
ggtatgtagg 
ggacagtatt 
gctcttgatc 
agattacgcg 
acgctcagtg 
acaattcatc 
agtcaaaaaa 
agaaggcaat 
ttactttgcc 
gttcctcttc 
gagtgtcttc
ccaattcggc 
agtgaaagag 
cttcatactc 
catcatgccg 
tcatgtcctt 
ttaaatatag 
ccgtatcttt 
ttttagccat 
taattataac 
gaaaacagct 
gattttgaaa 
taccctccgc 
agcatcggta 
cggactgatg
tgttggctgg
aataacacat 
tggattttag 
acaaatacaa 
ggaaccctaa 
gtcgatcgac 
gcgtcggttt
tctgcgggcg 
tcgaccctgc 
gtcaagacca 
cctccgctcg 
gatgttggcg 
tgttatgcgg 
ccggacttcg 
cgcactgacg 
gcatatgaaa 
cccgctcgtc 
tagaacagcg 
ggagatgcaa 
gagcgcggcc 
gctatttacc 
ttcgccctcc 
ctcgacagac 
gaaagctcga 
aatgaaatga 
atcccttacg
gtcttctttt 
agaggcatct 
ttccttttct 
gtttcccgat 
atctttgata 
cacttgcttt
gggtccatct 
gcaatgatgg 



cacagaatca 
gaaccgtaaa 
tcacaaaaat 
ggcgtttccc 
atacctgtcc 
gtatctcagt 
tcagcccgac 
cgacttatcg
cggtgctaca 
tggtatctgc 
cggcaaacaa 
cagaaaaaaa 
gaacgaaaac 
cagtaaaata 
tagctcgaca 
gtcataccac 
atctttcaca 
gggcttttcc 
ttcccagttt 
taagcggctg 
cctgatgcac 
ttccgagcaa 
ttcaaagtgc 
ttcccgttcc 
gttttcattt 
tacgcagcgg 
ttattatttc 
aagacgaact 
ttttcaaagt 
ccgcggtgat 
gagatcatcc 
acatgagcaa 
ggctgcctgt 
ctggtggcag 
tgcggacgtt 
tactggattt 
atacatacta 
ttcccttatc 
agatccggtc 
ccactatcgg 
atttgtgtac 
gcccaagctg 
atgcggagca 
aagtagcgcg 
acctcgtatt 
ccattgtccg
gggcagtcct 
gtgtcgtcca 
tcacgccatg 
tggctaagat 
ggcagttcgg 
taggtcaggc 
gatgcaaagt 
cgcaggacat
gagagctgca 
gtcgcggtga 
gagagataga
acttccttat 
tcagtggaga 
tccacgatgc 
tgaacgatag 
actgtccttt
attacccttt 
ttcttggagt 
gaagacgtgg 
ttgggaccac 
catttgtagg 



ggggataacg 
aaggccgcgt 
cgacgctcaa 
cctggaagct 
gcctttctcc 
tcggtgtagg 
cgctgcgcct 
ccactggcag 
gagttcttga 
gctctgctga 
accaccgctg 
ggatctcaag
tcacgttaag 
taatatttta 
tactgttctt 
ttgtccgccc 
aagatgttgc 
gtctttaaaa 
tcgcaatcca 
tctaagctat 
tccgcataca 
aggacgccat 
aggacctttg 
acatcatagg 
tctcccacca 
tatttttcga 
cttcctcttt 
ccaattcact 
tgttttcaaa 
cacaggcagc 
gtgtttcaaa 
agtctgccgc 
atcgagtggt 
gatatattgt 
tttaatgtac
tggttttagg 
agggtttctt 
tgggaactac 
ggcatctact 
cgagtacttc 
gcccgacagt 
catcatcgaa 
tatacgcccg 
tctgctgctc 
gggaatcacc
tcaggacatt 
cggcccaaag 
tcacagtttg 
tagtgtattg 
cggccgcagc 
tttcaggcag 
tctcgctaaa 
gccgataaac 
atccacgccc 
tcaggtcgga 
gttcaggctt 
tttgtagaga 
atagaggaag 
tatcacatca 
tcctcgtggg 
cctttccttt 
tgatgaagtg
gttgaaaagt 
agacgagagt 
ttggaacgtc 
tgtcggcaga 
tgccaccttc 



caggaaagaa 
tgctggcgtt 
gtcagaggtg 
ccctcgtgcg 
cttcgggaag 
tcgttcgctc 
tatccggtaa 
cagccactgg 
agtggtggcc 
agccagttac 
gtagcggtgg 
aagatccttt
ggattttggt 
ttttctccca 
ccccgatatc 
tgccgcttct 
tgtctcccag 
aatcatacag 
catcggccag 
tcgtataggg 
gctcgataat 
cggcctcact 
gaacaggcag 
tggtcccttt 
gcttatatac 
tcagtttttt 
tctacagtat 
gttccttgca 
gttggcgtat 
aacgctctgt 
cccggcagct 
cttacaacgg 
gattttgtgc 
ggtgtaaaca 
tgaattaacg 
aattagaaat 
atatgctcaa 
tcacacatta 
ctatttcttt 
tacacagcca 
cccggctccg 
attgccgtca 
gagtcgtggc 
catacaagcc 
gaacatcgcc 
gttggagccg 
catcagctca 
ccagtgatac 
accgattcct 
gatcgcatcc 
gtcttgcaac 
ctccccaatg 
ataacgatct 
tcctacatcg 
gacgctgtcg 
tttcatatct 
gagactggtg 
gtcttgcgaa 
atccacttgc 
tgggggtcca 
atcgcaatga 
acagatagct
ctcaatagcc 
gtcgtgctcc 
ttctttttcc 
ggcatcttga 
cttttctact 



catgtgagca 
tttccatagg 
gcgaaacccg 
ctctcctgtt 

cgtggcgctt 

caagctgggc 
ctatcgtctt 
taacaggatt 
taactacggc 
cttcggaaaa 
tttttttgtt 
gatcttttct 
catgcattct 
atcaggcttg 
ctccctgatc 
cccaagatca 
gtcgccgtgg
ctcgcgcgga 
atcgttattc 
acaatccgat 
cttttcaggg 
catgagcaga 
ctttccttcc 
ataccggctg 
cttagcagga
caattccggt 
ttaaagatac 
ttctaaaacc 
aacatagtat 
catcgttaca 
tagttgccgt 
ctctcccgct 
cgagctgccg 
aattgacgct 
ccgaattaat 
tttattgata 
cacatgagcg 
ttatggagaa 
gccctcggac 
tcggtccaga 
gatcggacga
accaagctct 
gatcctgcaa 
aaccacggcc 
tcgctccagt 
aaatccgcgt 
tcgagagcct 
acatggggat 
tgcggtccga 
atagcctccg 
gtgacaccct 
tcaagcactt 
ttgtagaaac 
aagctgaaag 
aacttttcga 
cattgccccc 
atttcagcgt 
ggatagtggg 
tttgaagacg 
tctttgggac 
tggcatttgt 
gggcaatgga 
ctttggtctt 
accatgttat 
acgatgctcc 
acgatagcct 
gtccttttga 



aaaggccagc 
ctccgccccc 
acaggactat 
ccgaccctgc 
tctcatagct 
tgtgtgcacg 
gagtccaacc 
agcagagcga 
tacactagaa 
agagttggta 
tgcaagcagc 
acggggtctg 
aggtactaaa 
atccccagta 
gaccggacgc 
ataaagccac 
gaaaagacaa 
tctttaaatg 
agtaagtaat 
atgtcgatgg 
ctttgttcat 
ttgctccagc 
agccatagca 
tccgtcattt 
gacattcctt 
gatattctca 
cccaagaagc 
ttaaatacca 
cgacggagcc 
atcaacatgc 
tcttccgaat 
gacgccgtcc 
gtcggggagc 
tagacaactt 
tcgggggatc 
gaagtatttt 
aaaccctata 
actcgagctt 
gagtgctggg 
cggccgcgct
ttgcgtcgca 
gatagagttg 
gctccggatg 
tccagaagaa 
caatgaccgc 
gcacgaggtg 
gcgcgacgga 
cagcaatcgc 
atgggccgaa 
cgaccggttg 
gtgcacggcg 
ccggaatcgg 
catcggcgca 
cacgagattc 
tcagaaactt 
cgggatctgc 
gtcctctcca 
attgtgcgtc 
tggttggaac 
cactgtcggc 
aggtgccacc 
atccgaggag 
ctgagactgt 
cacatcaatc 
tcgtgggtgg 
ttcctttatc 
tgaagtgaca 



5400 
5460 
5580
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6120
6240
6300
6600
6900
7260
7440
7500 
7560 
7740
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 
gatagctggg
aatagccctt 
gtgctccacc 
tggccgattc 
cgcaacgcaa 
cttccggctc 
tatgaccatg 
gcatgcaagc 
tacccaactt 
ggcccgcacc 
cttgagcttg 
tcaaatagag 
cttacgactc 
ctactccaaa 
acaaagggta 
tgtgaagata 
ggccatcgtt 
gagcatcgtg 
tatctccact 
tatataagga 



caatggaatc 
tggtcttctg
atgttggcaa 
attaatgcag 
ttaatgtgag 
gtatgttgtg 
attacgaatt 
ttggcactgg 
aatcgccttg 
gatcgccctt 
gatcagattg 
gacctaacag 
aatgacaaga 
aatatcaaag 
atatccggaa 
gtggaaaagg 
gaagatgcct 
gaaaaagaag 
gacgtaaggg 
agttcatttc 



cgaggaggtt 
agactgtatc 
gctgctctag 
ctggcacgac 
ttagctcact 
tggaattgtg
cgagctcggt 
ccgtcgtttt 
cagcacatcc 
cccaacagtt 
tcgtttcccg 
aactcgccgt 
agaaaatctt 
atacagtctc 
acctcctcgg
aaggtggctc 
ctgccgacag 
acgttccaac 
atgacgcaca 
atttggagag 



tcccgatatt 
tttgatattc 
ccaatacgca 
aggtttcccg 
cattaggcac 
agcggataac 
acccggggat 
acaacgtcgt 
ccctttcgcc 
gcgcagcctg 
ccttcagttt 
aaagactggc 
cgtcaacatg 
agaagaccaa 
attccattgc 
ctacaaatgc 
tggtcccaaa 
cacgtcttca 
atcccactat 
aacacggggg 



accctttgtt 
ttggagtaga 
aaccgcctct 
actggaaagc 
cccaggcttt 
aatttcacac 
cctctagagt 
gactgggaaa 
agctggcgta 
aatggcgaat 
agcttcatgg 
gaacagttca 
gtggagcacg 
agggcaattg
ccagctatct 
catcattgcg 
gatggacccc 
aagcaagtgg
ccttcgcaag 
actcttgac 



gaaaagtctc 
cgagagtgtc 
ccccgcgcgt 
gggcagtgag 
acactttatg 
aggaaacagc 
cgacctgcag 
accctggcgt 
atagcgaaga 
gctagagcag 
agtcaaagat 
tacagagtct 
acacacttgt 
agacttttca 
gtcactttat 
ataaaggaaa 
cacccacgag 
attgatgtga 
acccttcctc 



<210> 93 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> CaMV35SpolyA Primer 
<400> 93 

ctgaattaac gccgaattaa ttcgggggat 

<210> 94 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> CaMV35Spr Primer
<400> 94 

ctagagcagc ttgccaacat ggtggagca 

<210> 95 

<211> 12592 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAg2 Plasmid



ctg 



<400> 95

gtacgaagaa 

gccgctacaa 

ctgattggat 

ccgattactt 

ccgcaggcaa 

ccggagagtt 

cggagtacga 

gcaacctgat 

aaattgccct 

ttgggaaccc 

acattgggaa 

ccgcctaaaa 

tgtctggcca 

ccctacgccc 

gcctacggcc 



ggccaagaac 
gatcgtaaag 
gtaccgcgag 
tttgatcgat 
ggcagaagcc 
caagaagttc 
tttgaaggag 
cgagggcgaa 
agcaggggaa 
aaagccgtac 
ccggtcacac 
ctctttaaaa 
gcgcacagcc 
cgccgcttcg 
aggcaatcta 



ggccgcctgg 
agcgaaaccg 
atcacagaag 
cccggcatcg 
agatggttgt 
tgtttcaccg 
gaggcggggc 
gcatccgccg 
aaaggtcgaa 
attgggaacc 
atgtaagtga 
cttattaaaa 
gaagagctgc 
cgtcggccta 
ccagggcgcg 



tgacggtatc 
ggcggccgga 
gcaagaaccc 
gccgttttct 
tcaagacgat 
tgcgcaagct 
aggctggccc 
gttcctaatg 
aaggtctctt 
ggaacccgta 
ctgatataaa 
ctcttaaaac 
aaaaagcgcc 
tcgcggccgc 
gacaagccgc 



cgagggtgaa 
gtacatcgag 
ggacgtgctg 
ctaccgcctg 
ctacgaacgc 
gatcgggtca 
gatcctagtc 
tacggagcag 
tcctgtggat 
cattgggaac 
agagaaaaaa 
ccgcctggcc 
tacccttcgg 
tggccgctca 
gccgtcgcca 



gccttgatta 
atcgagctag 
acggttcacc 
gcacgccgcg 
agtggcagcg 
aatgacctgc 
atgcgctacc 
atgctagggc 
agcacgtaca 
ccaaagccgt 
ggcgattttt 
tgtgcataac 
tcgctgcgct 
aaaatggctg 
ctcgaccgcc 



9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10140 

10200 

10260 

10320 

10380 

10440 

10500 

10549 



33 



29 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 
ggcgcccaca
cacatgcagc 
gcccgtcagg 
cgtagcgata 
gagtgcacca 
ggcgctcttc
cggtatcagc 
gaaagaacat 
tggcgttttt
agaggtggcg 
tcgtgcgctc 
cgggaagcgt 
ttcgctccaa 
ccggtaacta 
ccactggtaa 

ggtggcctaa 

cagttacctt 
gcggtggttt 
atcctttgat
ttttggtcat 
tctcccaatc 
cgatatcctc 
cgcttctccc 
ctcccaggtc 
catacagctc 
cggccagatc
tatagggaca 
cgataatctt 
cctcactcat 
caggcagctt 
tccctttata 
tatatacctt 
gttttttcaa 
acagtattta 
ccttgcattc 
ggcgtataac 
gctctgtcat 
ggcagcttag 
acaacggctc 
tttgtgccga 
gtaaacaaat 
attaacgccg 
tagaaatttt 
tgctcaacac 
cacattatta 
taccggcagg 
gccggccgcc 
cgggtcgttg 
cttcagcagg 
gtacacggtc 
ggcgatgccg 
acggacgagg 
gcttgtctcg 
acggcggatg
gtagagagag 
gaggaaggtc 
cacatcaatc 
tcgtgggtgg 
ttcctttatc 
tgaagtgaca 
gaaaagtctc 
cgagagtgtc
gaacgtcttc 
cggcagaggc 
caccttcctt 
ggaggtttcc 
ctgtatcttt 



tcaaggcacc 
tcccggagac 
gcgcgtcagc 
gcggagtgta 
tatgcggtgt
cgcttcctcg 
tcactcaaag 
gtgagcaaaa 
ccataggctc 
aaacccgaca 
tcctgttccg 
ggcgctttct
gctgggctgt 
tcgtcttgag 
caggattagc 
ctacggctac 
cggaaaaaga 
ttttgtttgc 
cttttctacg 
gcattctagg 
aggcttgatc 
cctgatcgac 
aagatcaata 
gccgtgggaa 
gcgcggatct 
gttattcagt 
atccgatatg
ttcagggctt 
gagcagattg 
tccttccagc 
ccggctgtcc 
agcaggagac 
ttccggtgat 
aagatacccc 
taaaacctta 
atagtatcga
cgttacaatc 
ttgccgttct 
tcccgctgac 
gctgccggtc
tgacgcttag 
aattaattcg 
attgatagaa 
atgagcgaaa 
tggagaaact 
ctgaagtcca 
cgcagcatgc 
ggcagcccga 

tgggtgtaga 
gactcggccg 
gcgacctcgc 
tcgtccgtcc 
atgtagtggt 
tcggccgggc 
actggtgatt 
ttgcgaagga 
cacttgcttt 
gggtccatct
gcaatgatgg 
gatagctggg 
aatagccctt 
gtgctccacc 
tttttccacg 
atcttgaacg 
ttctactgtc 
cgatattacc 
gatattcttg 



ctgcctcgcg 
ggtcacagct 

gggtgttggc 

tactggctta 
gaaataccgc 
ctcactgact 
gcggtaatac
ggccagcaaa 
cgcccccctg 
ggactataaa 
accctgccgc 
catagctcac 
gtgcacgaac 
tccaacccgg 
agagcgaggt 
actagaagga 
gttggtagct
aagcagcaga 
gggtctgacg 
tactaaaaca 
cccagtaagt 
cggacgcaga 
aagccactta 
aagacaagtt
ttaaatggag 
aagtaatcca 
tcgatggagt 
tgttcatctt
ctccagccat 
catagcatca 
gtcattttta 
attccttccg 
attctcattt 
aagaagctaa 
aataccagaa 
cggagccgat 
aacatgctac 
tccgaatagc 
gccgtcccgg 
ggggagctgt 
acaacttaat 

ggggatctgg 

gtattttaca 
ccctatagga 
cgagtcaaat 
gctgccagaa 
cgcggggggc 
tgacagcgac 
gcgtggagcc 
tccagtcgta 
cgtccacctc 
actcctgcgg 
tgacgatggt 
gtcgttctgg 
tcagcgtgtc 
tagtgggatt 
gaagacgtgg 
ttgggaccac 
catttgtagg 
caatggaatc 
tggtcttctg 
atgttatcac 
atgctcctcg 
atagcctttc 
cttttgatga 
ctttgttgaa 
gagtagacga 



cgtttcggtg 
tgtctgtaag 

gggtgtcggg 

actatgcggc 
acagatgcgt 
cgctgcgctc 
ggttatccac 
aggccaggaa 
acgagcatca 
gataccaggc 
ttaccggata 
gctgtaggta 
cccccgttca 
taagacacga 
atgtaggcgg 
cagtatttgg 
cttgatccgg 
ttacgcgcag 
ctcagtggaa 
attcatccag 
caaaaaatag 
aggcaatgtc 
ctttgccatc 
cctcttcggg 
tgtcttcttc 
attcggctaa 
gaaagagcct 
catactcttc 
catgccgttc 
tgtccttttc 
aatataggtt 
tatcttttac 
tagccattta 
ttataacaag 
aacagctttt 
tttgaaaccg 
cctccgcgag 
atcggtaaca 
actgatgggc 
tggctggctg 
aacacattgc 
attttagtac 
aatacaaata 
accctaattc 
ctcggtgacg 
acccacgtca 
atatccgagc 
cacgctcttg 
cagtcccgtc 
ggcgttgcgt 
ggcgacgagc 
ttcctgcggc 
gcagaccgcc 
gctcatggta 
ctctccaaat 

gtgcgtcatc 

ttggaacgtc 
tgtcggcaga 
tgccaccttc 
cgaggaggtt 
agactgtatc 
atcaatccac 
tgggtggggg 
ctttatcgca 
agtgacagat 
aagtctcaat 
gagtgtcgtg 



atgacggtga 
cggatgccgg 
gcgcagccat 
atcagagcag 
aaggagaaaa 
ggtcgttcgg 
agaatcaggg 
ccgtaaaaag 
caaaaatcga 
gtttccccct 
cctgtccgcc 
tctcagttcg 
gcccgaccgc 
cttatcgcca 
tgctacagag 
tatctgcgct 
caaacaaacc 
aaaaaaagga 
cgaaaactca 
taaaatataa 
ctcgacatac 
ataccacttg 
tttcacaaag 
cttttccgtc 
ccagttttcg 
gcggctgtct 
gatgcactcc
cgagcaaagg 
aaagtgcagg 
ccgttccaca 
ttcattttct 
gcagcggtat 
ttatttcctt 
acgaactcca 
tcaaagttgt 
cggtgatcac 
atcatccgtg 
tgagcaaagt 
tgcctgtatc 
gtggcaggat 
ggacgttttt 
tggattttgg 
catactaagg 
ccttatctgg 
ggcaggaccg 
tgccagttcc 
gcctcgtgca 
aagccctgtg 
cgctggtggc 
gccttccagg 
cagggatagc
tcggtacgga
ggcatgtccg
gactcgagag 
gaaatgaact 
ccttacgtca 
ttctttttcc 
ggcatcttga 
cttttctact 
tcccgatatt 
tttgatattc 
ttgctttgaa 
tccatctttg 
atgatggcat 
agctgggcaa 
agccctttgg
ctccaccatg 



aaacctctga 
gagcagacaa 
gacccagtca 
attgtactga 
taccgcatca 
ctgcggcgag
gataacgcag
gccgcgttgc 
cgctcaagtc 
ggaagctccc 
tttctccctt 
gtgtaggtcg 
tgcgccttat
ctggcagcag 
ttcttgaagt 
ctgctgaagc
accgctggta 
tctcaagaag 
cgttaaggga 
tattttattt 
tgttcttccc 
tccgccctgc 
atgttgctgt 
tttaaaaaat 
caatccacat 
aagctattcg 
gcatacagct 
acgccatcgg 
acctttggaa 
tcataggtgg 
cccaccagct 
ttttcgatca 
cctcttttct 
attcactgtt 
tttcaaagtt 
aggcagcaac 
tttcaaaccc 
ctgccgcctt 
gagtggtgat 
atattgtggt 
aatgtactga
ttttaggaat 
gtttcttata 
gaactactca 
gacggggcgg
cgtgcttgaa 
tgcgcacgct 
cctccaggga

ggggggagac

ggcccgcgta 
gctcccgcag 
agttgaccgt 
cctcggtggc 
agatagattt
tccttatata 
gtggagatat 
acgatgctcc 
acgatagcct
gtccttttga 
accctttgtt 
ttggagtaga 
gacgtggttg 
ggaccactgt 
ttgtaggtgc 
tggaatccga 
tcttctgaga 
ttggcaagct 
gctctagcca

gcacgacagg 

gctcactcat 

aattgtgagc 

gccttgacta 

aaccacaact 

tttatttgta 

tgagatcccc 

acctttcata 

cggccacgaa 

gctgctcgcc 

cctccgacca 

tgtccggcac 

caccggcgaa 

cgaccgctcc 

tggatccaga 

atcgacactc 

attgagactt 

atctgtcact 

tgcgataaag 

cccccaccca 

gtggattgat 

gatacagtct 

aacctcctcg 

gaaggtggca 

tctgccgaca 

gacgttccaa 

gatgacgcac

atttggagag 

ttcgcagatc 

gagaagtttc 

gaagaatctc 

agctgcgccg 

ctcccgattc 

tcccgccgtg 

ctacaaccgg 

gggttcggcc 

tgcgcgattg 

gcgtccgtcg 

cggcacctcg 

acagcggtca

atcttcttct 

aggcatccgg 

gaccaactct 

cgatgcgacg 

agaagcgcgg 

cgccccagca 

gacaagctcg 

tcctataggg 

tgtaaaatac 

ccagatcccc 

tacaacgtcg 

cccctttcgc 

tgcgcagcct 

gccttcagtt 

agaattaagg 

tggaactgac 

tgagctaagc 

atcagctagc

gtatccaatt 

atcgaattcc 

gagttgtccc 

gtggagaggg 

ctggaaaact 

gcttttcaag 

agggatacgt 

ctgaagtcaa 



atacgcaaac
tttcccgact 
taggcacccc 
ggataacaat 
gagggtcgac 
agaatgcagt
accattataa 
gcgctggagg 
gaaggcggcg 
gtgcacgcag 
gatctcggtc
ctcggcgtac 
cacctggtcc 
gtcgtcctcc 
ggcgacgtcg
tttcgctcaa 
tcgtctactc 
ttcaacaaag 
tcatcaaaag 
gaaaggctat 
cgaggagcat 
gtgataacat
cagaagacca 
gattccattg 
cctacaaatg 
gtggtcccaa 
ccacgtcttc 
aatcccacta 
gacacgctga
cgggggggca
tgatcgaaaa 
gtgctttcag 
atggtttcta 
cggaagtgct
cacagggtgt 
tcgcggaggc
cattcggacc 
ctgatcccca 
cgcaggctct 
tgcacgcgga 
ttgactggag 
ggaggccgtg 
agcttgcagg
atcagagctt 
caatcgtccg 
ccgtctggac 
ctcgtccgag 
agtttctcca 
tttcgctcat
ttctatcaat 
cgaattaatt 
tgactgggaa 
cagctggcgt 
gaatggcgaa 
tggggatcct 
gagtcacgtt
agaaccgcaa
acatacgtca
aaatatttct 
agagtctcat 
cgcggccgcc 
aattcttgtt 
tgaaggtgat 
acctgttccg
atacccagat 
gcaggagagg 
gtttgaggga 



cgcctctccc 
ggaaagcggg 
aggctttaca 
ttcacacagg
ggtatacaga 
gaaaaaaatg 
gctgcaataa
atcatccagc 
gtggaatcga 
ttgccggccg 
atggccggcc 
agctcgtcca 
tggaccgcgc 
acgaagtccc 
cgcgcggtga 
gttagtataa 
caagaatatc 
ggtaatatcg 
gacagtagaa 
cgttcaagat 
cgtggaaaaa 
ggtggagcac 
aagggctatt
cccagctatc 
ccatcattgc 
agatggaccc 
aaagcaagtg 
tccttcgcaa 
aatcaccagt 
atgagatatg 
gttcgacagc 
cttcgatgta 
caaagatcgt 
tgacattggg 
cacgttgcaa 
tatggatgcg 
gcaaggaatc
tgtgtatcac 
cgatgagctg 
tttcggctcc 
cgaggcgatg
gttggcttgt 
atcgccacga 
ggttgacggc 
atccggagcc 
cgatggctgt 
ggcaaagaaa 
taataatgtg 
gtgttgagca 
aaaatttcta 
cggcgttaat 
aaccctggcg 
aatagcgaag 
tgctagagca
ctagactgaa 
atgacccccg 
cgttgaagga 
gaaaccatta 
tgtcaaaaat 
attcactctc 
atggtagatc
gaattagatg 
gcaacatacg 
tggccaacac 
catatgaagc 
accatcttct 
gacaccctcg 



cgcgcgttgg 
cagtgagcgc
ctttatgctt
aaacagctat 
catgataaga 
ctttatttgt 
acaagttggg 
cggcgtcccg 
aatctcgtag
ggtcgcgcag 
cggaggcgtc
ggccgcgcac 
tgatgaacag 
gggagaaccc 
gcaccggaac 
aaaagcaggc 
aaagatacag
ggaaacctcc 
aaggaaggtg 
gcctctgccg 
gaagacgttc 
gacactctcg 
gagacttttc 
tgtcacttca 
gataaaggaa 
ccacccacga 
gattgatgtg 
gaccttcctc 
ctctctctac 
aaaaagcctg 
gtctccgacc 

ggagggcgtg 
tatgtttatc 
gagtttagcg 
gacctgcctg 
atcgctgcgg
ggtcaataca 
tggcaaactg 
atgctttggg 
aacaatgtcc 
ttcggggatt
atggagcagc 
ctccgggcgt 
aatttcgatg 
gggactgtcg 
gtagaagtac
tagagtagat 
tgagtagttc 
tataagaaac 
attcctaaaa 
tcagatcaag 
ttacccaact 
aggcccgcac 
gcttgagctt 
ggcgggaaac
ccgatgacgc 
gccactcagc 
ttgcgcgttc 
gctccactga 
aatccaaata 
tgactagtaa 
gtgatgttaa 
gaaaacttac 
ttgtcactac
ggcacgactt 
tcaaggacga 
tcaacaggat 



ccgattcatt 
aacgcaatta
ccggctcgta 
gaccatgatt 
tacattgatg 
gaaatttgtg 
gtgggcgaag 
gaaaacgatt 
cacgtgtcag 
ggcgaactcc 
ccggaagttc 
ccacacccag 
ggtcacgtcg
gagccggtcg
ggcactggtc 
ttcaatcctg 
tctcagaaga 
tcggattcca
gcacctacaa 
acagtggtcc 
caaccacgtc 
tctactccaa 
aacaaagggt 
tcaaaaggac 
aggctatcgt 
ggagcatcgt 
atatctccac 
tatataagga 
aaatctatct 
aactcaccgc 
tgatgcagct
gatatgtcct 
ggcactttgc 
agagcctgac
aaaccgaact 
ccgatcttag 
ctacatggcg 
tgatggacga 
ccgaggactg 
tgacggacaa 
cccaatacga 
agacgcgcta
atatgctccg
atgcagcttg 
ggcgtacaca
tcgccgatag
gccgaccgga 
ccagataagg 
ccttagtatg 
ccaaaatcca 
cttggcactg 
taatcgcctt
cgatcgccct 
ggatcagatt 
gacaatctga 
gggacaagcc
cgcgggtttc 
aaaagtcgcc
cgttccataa
atctgcaccg 
aggagaagaa 
tgggcacaaa 
ccttaaattt 
tttctcttat 
cttcaagagc 
cgggaactac
cgagcttaag 



aatgcagctg 
atgtgagtta 
tgttgtgtgg 
acgaattcga 
agtttggaca 
atgctattgc 
aactccagca 
ccgaagccca 
tcctgctcct 
cgcccccacg 
gtggacacga 
gccagggtgt 
tcccggacca 
gtccagaact 
aacttggcca
caggaattcg 
ccaaagggct 
ttgcccagct
atgccatcat
caaagatgga 
ttcaaagcaa 
gaatatcaaa 
aatatcggga 
agtagaaaag
tcaagatgcc
ggaaaaagaa 
tgacgtaagg
agttcatttc 
ctctcgagct 
gacgtctgtc 
ctcggagggc
gcgggtaaat 
atcggccgcg 
ctattgcatc
gcccgctgtt 
ccagacgagc 
tgatttcata 
caccgtcagt 
ccccgaagtc 
tggccgcata
ggtcgccaac 
cttcgagcgg
cattggtctt 
ggcgcagggt 
aatcgcccgc 
tggaaaccga 
tctgtcgatc 
gaattagggt
tatttgtatt 
gtactaaaat 
gccgtcgttt 
gcagcacatc 
tcccaacagt 
gtcgtttccc 
tcatgagcgg
gttttacgtt 
tggagtttaa 
taaggtcact 
attcccctcg 
gatctcgaga 
cttttcactg 
ttttctgtca 
atttgcacta
ggtgttcaat 
gccatgcctg 
aagacacgtg 
ggaatcgatt 



4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
tcaaggagga
tatacatcat 
acatcgaaga 
atggccctgt 
atcccaacga 
cacatggcat 
gtgaccagct 
aatcctgttg 
gtaataatta 
ccgcaattat 
ttatcgcgcg 
ggatatattg 
aagggcgtga 
ccctcgggat 
acgttcagtg 
ggctgccgcc 
gaatacttgc 
gctgggctat 
gcacgcggcc 
cccggagctg 
gctagaccgc
ggccggcgcg 
ccgcatggtg 
ccgcacccgg 
taccctcacc 
cgtgaaagag 
gcgcagcgag 
attgaccgag 
aaaccgcacc 
atgatcgcgg 
gaaatcctgg 
gaagaaaccg 
catgcggtcg 
tgaaggttat 
atctagcccg 
agggcagtgc 
tcgaccgccc 
tcgacggagc 
tgctgattcc 
tggttaagca 
gggcgatcaa 
tgcccattct 
gcacaaccgt 
ccgctgaaat 
cacaaacacg 
cagcctggca 
caccaagctg 
atacatcgcg 
cggctaaagg 
ccatgtgtgg 
gcaatggcac 
ccggtacaaa 
gccgcccagc 
gctgatcgaa 
aagccgccca 
acccgcgata 
cgagctggcg 
ccggccggca 
accgaatcca 
ccacacgttg 
gacctggtag 



cggaaacatc 
ggccgacaag 
cggcggcgtg 
ccttttacca 
aaagagagac 
ggatgaacta 
cgaatttccc 
ccggtcttgc 
acatgtaatg 
acatttaata 
cggtgtcatc 
gcgggtaaac 
aaaggtttat 
caaagtactt 
cagccgtctt 
ctgccctttt 
gactagaacc 
gcccgcgtca 
ggctgcacca 
gccaggatgc 
ctggcccgca 
ggcctgcgta 
ttgaccgtgt 
agcgggcgcg 
ccggcacaga 
gcggctgcac 
gaagtgacgc 
gccgacgccc 
aggacggcca 
ccgggtacgt 
ccggtttgtc 
agcgccgccg 
ctgcgtatat 
cgctgtactt 
cgccctgcaa 
ccgcgattgg 
gacgattgac 
gccccaggcg 
ggtgcagcca 
gcgcattgag 
aggcacgcgc 
tgagtcccgt 
tcttgaatca 
taaatcaaaa 
ctaagtgccg 
gacacgccag 
aagatgtacg 
cagctaccag 
aggcggcatg 
aggaacgggc 
tggaaccccc 
tcggcgcggc 
ggcaacgcat 
tccgcaaaga 
agggcgacga 
gtcgcagcat 
aggtgatccg 
tggccagtgt 
tgaaccgata 
cggacgtact 
aaacctgcat 



ctcggccaca 
caaaagaacg 
caactcgctg 
gacaaccatt 
cacatggtcc 
tacaaagcta 
cgatcgttca 
gatgattatc 
catgacgtta 
cgcgatagaa 
tatgttacta 
ctaagagaaa 
ccgttcgtcc 
tgatccaacc 
ctgaaaacga 
cctggcgttt 
ggagacatta 
gcaccgacga 
agctgttttc
ttgaccacct 
gcacccgcga 
gcctggcaga 
tcgccggcat 
aggccgccaa 
tcgcgcacgc 
tgcttggcgt 
ccaccgaggc 
tggcggccgc 
ggacgaaccg 
gttcgagccg 
tgatgccaag 
tctaaaaagg 
gatgcgatga 
aaccagaaag 
ctcgccgggg 
gcggccgtgc 
cgcgacgtga 
gcggacttgg 
agcccttacg 
gtcacggatg 
atcggcggtg 
atcacgcagc
gaacccgagg 
ctcatttgag 
gccgtccgag 
ccatgaagcg 
cggtacgcca 
agtaaatgag 
gaaaatcaag 
ggttggccag 
aagcccgagg 
gctgggtgat 
cgaggcagaa 
atcccggcaa
gcaaccagat 
catggacgtg 
ctacgagctt 
gtgggattac 
ccgggaaggg 
caagttctgc 
tcggttaaac 



agttggaata 
gcatcaaagc 
atcattatca
acctgtccac 
ttcttgagtt 
gccaccacca 
aacatttggc 
atataatttc 
tttatgagat 
aacaaaatat 
gatcgggaat 
agagcgttta 
atttgtatgt 
cctccgctgc 
catgtcgcac 
tcttgtcgcg 
cgccatgaac 
ccaggacttg 
cgagaagatc 
acgccctggc 
cctactggac 
gccgtgggcc 
tgccgagttc 
ggcccgaggc 
ccgcgagctg 
gcatcgctcg 
caggcggcgc 
cgagaatgaa 
tttttcatta 
cccgcgcacg 
ctggcggcct 
tgatgtgtat 
gtaaataaac 
gcgggtcagg 
ccgatgttct 
gggaagatca 
aggccatcgg 
ctgtgtccgc 
acatatgggc 
gaaggctaca 
aggttgccga 
gcgtgagcta 
gcgacgctgc 
ttaatgaggt 
cgcacgcagc 
ggtcaacttt 
aggcaagacc 
caaatgaata 
aacaaccagg 
gcgtaagcgg 
aatcggcgtg 
gacctggtgg 
gcacgccccg 
ccgccggcag 
tttttcgttc 
gccgttttcc 
ccagacgggc 
gacctggtac 
aagggagaca 
cggcgagccg 
accacgcacg 



caactacaac 
caacttcaag 
acaaaatact 
acaatctgcc 
tgtaacagct 
ccaccaccac 
aataaagttt 
tgttgaatta 
gggtttttat 
agcgcgcaaa 
taaactatca 
ttagaataac 
gcatgccaac 
tatagtgcag 
aagtcctaag 
tgttttagtc 
aagagcgccg 
accaaccaac 
accggcacca 
gacgttgtga 
attgccgagc 
gacaccacca 
gagcgttccc 
gtgaagtttg 
atcgaccagg 
accctgtacc 
ggtgccttcc 
cgccaagagg 
ccgaagagat 
tctcaaccgt 
ggccggccag 
ttgagtaaaa 
aaatacgcaa 
caagacgacc 
gttagtcgat 
accgctaacc 
ccggcgcgac 
gatcaaggca 
caccgccgac 
agcggccttt 
ggcgctggcc 
cccaggcact 
ccgcgaggtc 
aaagagaaaa 
agcaaggctg 
cagttgccgg 
attaccgagc 
aatgagtaga 
caccgacgcc 
ctgggttgtc 
acggtcgcaa 
agaagttgaa 
gtgaatcgtg 

ccggtgcgcc 

cgatgctcta 
gtctgtcgaa 
acgtagaggt 
tgatggcggt 
agcccggccg 
atggcggaaa 
ttgccatgca 



tcccacaacg 
acccgccaca 
ccaattggcg 
ctttcgaaag 
gctgggatta 
gtgtgaattg 
cttaagattg 
cgttaagcat 
gattagagtc 
ctaggataaa 
gtgtttgaca 
ggatatttaa 
cacagggttc 
tcggcttctg 
ttacgcgaca 
gcataaagta 
ccgctggcct 
gggccgaact
ggcgcgaccg 
cagtgaccag 
gcatccagga 
cgccggccgg 
taatcatcga 
gcccccgccc 
aaggccgcac 
gcgcacttga 
gtgaggacgc 
aacaagcatg 
cgaggcggag 
gcggctgcat 
cttggccgct 
cagcttgcgt 
ggggaacgca 
atcgcaaccc 
tccgatcccc 
gttgtcggca 
ttcgtagtga 
gccgacttcg 
ctggtggagc 
gtcgtgtcgc 
gggtacgagc 
gccgccgccg 
caggcgctgg 
tgagcaaaag 
caacgttggc 
cggaggatca
tgctatctga 
tgaattttag 
gtggaatgcc 
tgccggccct 
accatccggc 
ggccgcgcag 
gcaagcggcc 
gtcgattagg
tgacgtgggc 
gcgtgaccga 
ttccgcaggg 
ttcccatcta 
cgtgttccgt 
gcagaaagac 
gc 



9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10140 

10200 

10260 

10320 

10380 

10440 

10500 

10560 

10620 

10680 

10740 

10800 

10860 

10920 

10980 

11040 

11100 

11160 

11220 

11280 

11340 

11400 

11460 

11520 

11580 

11640 

11700 

11760 

11820 

11880 

11940 

12000 

12060 

12120 

12180 

12240 

12300 

12360 

12420 

12480 

12540 

12592 



<210> 96 

<211> 3357 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> pGEMEasyNOS Plasmid 



<400> 96 

tatcactagt 

tggatgcata 

tagctgtttc

agcataaagt 

cgctcactgc 

caacgcgcgg 

tcgctgcgct 

cggttatcca 

aaggccagga 

gacgagcatc 

agataccagg 

cttaccggat 

cgctgtaggt 

ccccccgttc 

gtaagacacg 

tatgtaggcg 

acagtatttg 

tcttgatccg 

attacgcgca 

gctcagtgga 

ttcacctaga 

taaacttggt 

ctatttcgtt 

ggcttaccat 

gatttatcag 

ttatccgcct 

gttaatagtt 

tttggtatgg

atgttgtgca 

gccgcagtgt 

tccgtaagat 

atgcggcgac 

agaactttaa 

ttaccgctgt 

tcttttactt 

aagggaataa 

tgaagcattt 

aataaacaaa 

aataccgcac 

ttgttaaaat 

atcggcaaaa 

gtttggaaca 

gtctatcagg

aggtgccgta 

ggaaagccgg 

gcgctggcaa 

ccgctacagg 

tgcgggcctc 

gttgggtaac 

aatacgactc 

gccgcgggaa 

gactctaatt 

atatttgcta 

gtatgtgctt 

ggttctgtca 

tgactccctt 



gaattcgcgg 
gcttgagtat 
ctgtgtgaaa
gtaaagcctg 
ccgctttcca 
ggagaggcgg 
cggtcgttcg 
cagaatcagg 
accgtaaaaa 
acaaaaatcg 
cgtttccccc 
acctgtccgc 
atctcagttc 
agcccgaccg 
acttatcgcc 
gtgctacaga 
gtatctgcgc 
gcaaacaaac 
gaaaaaaagg 
acgaaaactc
tccttttaaa 
ctgacagtta 
catccatagt
ctggccccag 
caataaacca 
ccatccagtc 
tgcgcaacgt 
cttcattcag 
aaaaagcggt 
tatcactcat 
gcttttctgt
cgagttgctc 
aagtgctcat 
tgagatccag 
tcaccagcgt 
gggcgacacg 
atcagggtta 
taggggttcc 
agatgcgtaa 
tcgcgttaaa 
tcccttataa 
agagtccact 
gcgatggccc 
aagcactaaa 
cgaacgtggc 
gtgtagcggt 
gcgcgtccat 
ttcgctatta 
gccagggttt 
actatagggc 
ttcgattctc 
ggataccgag 
gctgatagtg 
agctcattaa 
gttccaaacg 
aattctccgc 



ccgcctgcag 
tctatagtgt 
ttgttatccg

gggtgcctaa 

gtcgggaaac 
tttgcgtatt 
gctgcggcga 
ggataacgca 
ggccgcgttg 
acgctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
actggcagca 
gttcttgaag 
tctgctgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
ttaaaaatga 
ccaatgctta 
tgcctgactc 
tgctgcaatg 
gccagccgga 
tattaattgt 
tgttgccatt 
ctccggttcc 
tagctccttc 
ggttatggca
gactggtgag 
ttgcccggcg 
cattggaaaa 
ttcgatgtaa 
ttctgggtga 
gaaatgttga 
ttgtctcatg 
gcgcacattt 
ggagaaaata
tttttgttaa 
atcaaaagaa 
attaaagaac 
actacgtgaa 
tcggaaccct 
gagaaaggaa 
cacgctgcgc 
tcgccattca 
cgccagctgg 
tcccagtcac 
gaattgggcc 
gagatccggt 
gggaatttat 
accttaggcg 
actccagaaa 
taaaacggct 
tcatgatcag 



gtcgaccata 
cacctaaata 
ctcacaattc 
tgagtgagct 
ctgtcgtgcc 
gggcgctctt 
gcggtatcag 
ggaaagaaca 
ctggcgtttt
cagaggtggc 
ctcgtgcgct 
tcgggaagcg 
gttcgctcca 
tccggtaact 
gccactggta 
tggtggccta 
ccagttacct 
agcggtggtt
gatcctttga 
attttggtca 
agttttaaat 
atcagtgagg 
cccgtcgtgt 
ataccgcgag 
agggccgagc 
tgccgggaag 
gctacaggca 
caacgatcaa 
ggtcctccga 
gcactgcata 
tactcaacca 
tcaatacggg 
cgttcttcgg 
cccactcgtg 
gcaaaaacag 
atactcatac 
agcggataca 
ccccgaaaag 
ccgcatcagg 
atcagctcat 
tagaccgaga 
gtggactcca 
ccatcaccct 
aaagggagcc 
gggaagaaag 
gtaaccacca 
ggctgcgcaa 
cgaaaggggg 
gacgttgtaa 
cgacgtcgca
gcagattatt 
ggaacgtcag 
acttttgaac 
cccgcggctg 
tgtcccgcgt 
attgtcgttt 



tgggagagct 

gcttggcgta 
cacacaacat 
aactcacatt 
agctgcatta 
ccgcttcctc 
ctcactcaaa 
tgtgagcaaa 
tccataggct 
gaaacccgac 
ctcctgttcc 
tggcgctttc 
agctgggctg 
atcgtcttga 
acaggattag 
actacggcta 
tcggaaaaag
tttttgtttg 
tcttttctac 
tgagattatc 
caatctaaag 
cacctatctc 
agataactac 
acccacgctc 
gcagaagtgg 
ctagagtaag 
tcgtggtgtc
ggcgagttac 
tcgttgtcag 
attctcttac 
agtcattctg 
ataataccgc 
ggcgaaaact 
cacccaactg 
gaaggcaaaa 
tcttcctttt 
tatttgaatg 
tgccacctga 
aaattgtaag
tttttaacca 
tagggttgag 
acgtcaaagg 
aatcaagttt 
cccgatttag 
cgaaaggagc 
cacccgccgc 
ctgttgggaa 
atgtgctgca 
aacgacggcc 
tgctcccggc 
tggattgaga
tggagcattt 
gcgcaataat 
agtggctcct 
catcggcggg 
cccgccttca 



cccaacgcgt 
atcatggtca 
acgagccgga 
aattgcgttg 
atgaatcggc 
gctcactgac 
ggcggtaata 
aggccagcaa 
ccgcccccct 
aggactataa 
gaccctgccg 
tcatagctca 
tgtgcacgaa 
gtccaacccg 
cagagcgagg 
cactagaaga 
agttggtagc
caagcagcag 

ggggtctgac 

aaaaaggatc 
tatatatgag 
agcgatctgt 
gatacgggag 
accggctcca 
tcctgcaact
tagttcgcca 
acgctcgtcg 
atgatccccc
aagtaagttg 
tgtcatgcca 
agaatagtgt 
gccacatagc 
ctcaaggatc 
atcttcagca 
tgccgcaaaa 
tcaatattat 
tatttagaaa 
tgcggtgtga 
cgttaatatt 
ataggccgaa 
tgttgttcca 
gcgaaaaacc 
tttggggtcg 
agcttgacgg 

gggcgctagg 

gcttaatgcg 
gggcgatcgg 
aggcgattaa 
agtgaattgt 
cgccatggcg 
gtgaatatga 
ttgacaagaa 
ggtttctgac 
tcaacgttgc 
ggtcataacg 
gtctaga 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720

780 

840 

900 

960 

1020 

1080 

1140

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3357 



<210> 97 
<211> 10122 
<212> DNA 

<213> Artificial Sequence



<220> 
<223> p1302NOS Plasmid



<400> 97 

catggtagat 

tgaattagat 

tgcaacatac 

gtggccaaca 

tcatatgaag 

gaccatcttc 

agacaccctc 

cctcggccac 

gcaaaagaac 

gcaactcgct 

agacaaccat 

ccacatggtc 

atacaaagct 

ccgatcgttc 

cgatgattat 

gcatgacgtt 

acgcgataga 

ctatgttact 

cctaagagaa 

tccgttcgtc 

ttgatccaac 

tctgaaaacg 

tcctggcgtt 

cggagacatt 

agcaccgacg 

aagctgtttt 

cttgaccacc 

agcacccgcg 

agcctggcag 

ttcgccggca 

gaggccgcca 

atcgcgcacg

ctgcttggcg 

cccaccgagg 

ctggcggccg 

aggacgaacc 

tgttcgagcc 

ctgatgccaa 

gtctaaaaag 

tgatgcgatg 

taaccagaaa 

actcgccggg 

ggcggccgtg 

ccgcgacgtg 

ggcggacttg 

aagcccttac 

ggtcacggat 

catcggcggt

tatcacgcag 

agaacccgag 

actcatttga 

ggccgtccga 

gccatgaagc 

gcggtacgcc 

gagtaaatga 

ggaaaatcaa 

cggttggcca 

caagcccgag 

cgctgggtga 

tcgaggcaga 

aatcccggca 

agcaaccaga

tcatggacgt 

gctacgagct 



ctgactagta 
ggtgatgtta 
ggaaaactta
cttgtcacta 
cggcacgact 
ttcaaggacg 
gtcaacagga 
aagttggaat 
ggcatcaaag 
gatcattatc
tacctgtcca 
cttcttgagt 
agccaccacc 
aaacatttgg 
catataattt 
atttatgaga 
aaacaaaata 
agatcgggaa
aagagcgttt 
catttgtatg 
ccctccgctg 
acatgtcgca
ttcttgtcgc 
acgccatgaa
accaggactt 
ccgagaagat 
tacgccctgg 
acctactgga 
agccgtgggc 
ttgccgagtt 
aggcccgagg
cccgcgagct 
tgcatcgctc 
ccaggcggcg 
ccgagaatga 
gtttttcatt 
gcccgcgcac 
gctggcggcc 
gtgatgtgta 
agtaaataaa
ggcgggtcag 
gccgatgttc 
cgggaagatc
aaggccatcg
gctgtgtccg 
gacatatggg 
ggaaggctac
gaggttgccg 
cgcgtgagct 
ggcgacgctg 
gttaatgagg 
gcgcacgcag 
gggtcaactt 
aaggcaagac 
gcaaatgaat 
gaacaaccag 
ggcgtaagcg 
gaatcggcgt 
tgacctggtg 
agcacgcccc 
accgccggca 
ttttttcgtt 
ggccgttttc 
tccagacggg



aaggagaaga 
atgggcacaa 
cccttaaatt 
ctttctctta 
tcttcaagag 
acgggaacta
tcgagcttaa 
acaactacaa 
ccaacttcaa 
aacaaaatac 
cacaatctgc 
ttgtaacagc 
accaccacca 
caataaagtt 
ctgttgaatt 
tgggttttta 
tagcgcgcaa
ttaaactatc 
attagaataa 
tgcatgccaa 
ctatagtgca
caagtcctaa 
gtgttttagt 
caagagcgcc 
gaccaaccaa
caccggcacc 
cgacgttgtg 
cattgccgag
cgacaccacc 
cgagcgttcc 
cgtgaagttt 
gatcgaccag 
gaccctgtac 
cggtgccttc 
acgccaagag
accgaagaga 
gtctcaaccg 
tggccggcca 
tttgagtaaa 
caaatacgca
gcaagacgac 
tgttagtcga 
aaccgctaac
gccggcgcga 
cgatcaaggc 
ccaccgccga 
aagcggcctt

aggcgctggc

acccaggcac 
cccgcgaggt 
taaagagaaa 
cagcaaggct 
tcagttgccg 
cattaccgag 
aaatgagtag 
gcaccgacgc 
gctgggttgt 
gacggtcgca
gagaagttga
ggtgaatcgt 
gccggtgcgc 
ccgatgctct 
cgtctgtcga 
cacgtagagg



acttttcact 
attttctgtc 
tatttgcact
tggtgttcaa 
cgccatgcct 
caagacacgt 
gggaatcgat 
ctcccacaac 
gacccgccac 
tccaattggc 
cctttcgaaa 
tgctgggatt 
cgtgtgaatt 
tcttaagatt 
acgttaagca
tgattagagt 
actaggataa 
agtgtttgac 
cggatattta
ccacagggtt 
gtcggcttct
gttacgcgac
cgcataaagt 
gccgctggcc 
cgggccgaac 
aggcgcgacc 
acagtgacca 
cgcatccagg 
acgccggccg 
ctaatcatcg 
ggcccccgcc 
gaaggccgca
cgcgcacttg 
cgtgaggacg 
gaacaagcat 
tcgaggcgga
tgcggctgca
gcttggccgc 
acagcttgcg
aggggaacgc
catcgcaacc 
ttccgatccc 
cgttgtcggc 
cttcgtagtg
agccgacttc 
cctggtggag 
tgtcgtgtcg 
cgggtacgag
tgccgccgcc 
ccaggcgctg 
atgagcaaaa
gcaacgttgg 
gcggaggatc
ctgctatctg 
atgaatttta 
cgtggaatgc 
ctgccggccc 
aaccatccgg
aggccgcgca 
ggcaagcggc
cgtcgattag 
atgacgtggg 
agcgtgaccg
tttccgcagg



ggagttgtcc 
agtggagagg
actggaaaac 
tgcttttcaa 
gagggatacg
gctgaagtca 
ttcaaggagg 
gtatacatca 
aacatcgaag 
gatggccctg 
gatcccaacg 
acacatggca 
ggtgaccagc 
gaatcctgtt 
tgtaataatt
cccgcaatta 
attatcgcgc
aggatatatt 
aaagggcgtg 
cccctcggga 
gacgttcagt 
aggctgccgc 
agaatacttg 
tgctgggcta 
tgcacgcggc 
gcccggagct 
ggctagaccg 
aggccggcgc 
gccgcatggt 
accgcacccg 
ctaccctcac 
ccgtgaaaga 
agcgcagcga
cattgaccga 
gaaaccgcac 
gatgatcgcg
tgaaatcctg 
tgaagaaacc 
tcatgcggtc
atgaaggtta 
catctagccc 
cagggcagtg 
atcgaccgcc 
atcgacggag
gtgctgattc
ctggttaagc 
cgggcgatca
ctgcccattc 
ggcacaaccg 
gccgctgaaa
gcacaaacac
ccagcctggc 
acaccaagct 
aatacatcgc
gcggctaaag 
cccatgtgtg 
tgcaatggca
cccggtacaa 
ggccgcccag 
cgctgatcga
gaagccgccc
cacccgcgat 
acgagctggc 
gccggccggc 



caattcttgt 
gtgaaggtga 
tacctgttcc 
gatacccaga 
tgcaggagag 
agtttgaggg 
acggaaacat
tggccgacaa 
acggcggcgt
tccttttacc 
aaaagagaga 
tggatgaact 
tcgaatttcc 
gccggtcttg
aacatgtaat 
tacatttaat 
gcggtgtcat 

ggcgggtaaa 
aaaaggttta 
tcaaagtact 
gcagccgtct 
cctgcccttt 
cgactagaac 
tgcccgcgtc 
cggctgcacc 
ggccaggatg 
cctggcccgc 
gggcctgcgt 
gttgaccgtg 
gagcgggcgc
cccggcacag 
ggcggctgca
ggaagtgacg 
ggccgacgcc 
caggacggcc 
gccgggtacg
gccggtttgt 
gagcgccgcc 
gctgcgtata
tcgctgtact
gcgccctgca 
cccgcgattg 
cgacgattga
cgccccaggc 
cggtgcagcc 
agcgcattga
aaggcacgcg 
ttgagtcccg 
ttcttgaatc 
ttaaatcaaa 
gctaagtgcc
agacacgcca 
gaagatgtac 
gcagctacca 
gaggcggcat
gaggaacggg
ctggaacccc 
atcggcgcgg
cggcaacgca 
atccgcaaag
aagggcgacg 
agtcgcagca
gaggtgatcc 
atggccagtg



60 

120 

180 

240 

300 

360 

420

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 
tgtgggatta
accgggaagg 
tcaagttctg 
ttcggttaaa 
tggtgacggt 
ccgggcggcc
aaggcaagaa 
tcggccgttt 
tgttcaagac 
ccgtgcgcaa 
ggcaggctgg 
ccggttccta 
gaaaaggtct 
accggaaccc 
tgactgatat 
aaactcttaa 
tgcaaaaagc 
ctatcgcggc 
gcggacaagc 
gcgcgtttcg 
gcttgtctgt 
ggcgggtgtc 
ttaactatgc 
cgcacagatg 
actcgctgcg 
tacggttatc 
aaaaggccag
ctgacgagca
aaagatacca 
cgcttaccgg 
cacgctgtag 
aaccccccgt 
cggtaagaca 
ggtatgtagg 
ggacagtatt 
gctcttgatc 
agattacgcg 
acgctcagtg 
acaattcatc 
agtcaaaaaa 
agaaggcaat 
ttactttgcc 
gttcctcttc 
gagtgtcttc 
ccaattcggc 
agtgaaagag 
cttcatactc 
catcatgccg 
tcatgtcctt 
ttaaatatag 
ccgtatcttt 
ttttagccat 
taattataac 
gaaaacagct 
gattttgaaa
taccctccgc 
agcatcggta 
cggactgatg 
tgttggctgg 
aataacacat 
tggattttag 
acaaatacaa 
ggaaccctaa 
gtcgatcgac 
gcgtcggttt 
tctgcgggcg 
tcgaccctgc 



cgacctggta 
gaagggagac 
ccggcgagcc 
caccacgcac 
atccgagggt 
ggagtacatc 
cccggacgtg 
tctctaccgc 
gatctacgaa 
gctgatcggg
cccgatccta 
atgtacggag 
ctttcctgtg 
gtacattggg 
aaaagagaaa 
aacccgcctg 
gcctaccctt 
cgctggccgc 
cgcgccgtcg 
gtgatgacgg 
aagcggatgc 
ggggcgcagc 
ggcatcagag 
cgtaaggaga 
ctcggtcgtt 
cacagaatca 
gaaccgtaaa 
tcacaaaaat 
ggcgtttccc 
atacctgtcc 
gtatctcagt 
tcagcccgac 
cgacttatcg 
cggtgctaca
tggtatctgc 
cggcaaacaa 
cagaaaaaaa 
gaacgaaaac 
cagtaaaata 
tagctcgaca 
gtcataccac 
atctttcaca 
gggcttttcc 
ttcccagttt 
taagcggctg 
cctgatgcac 
ttccgagcaa 
ttcaaagtgc 
ttcccgttcc 
gttttcattt 
tacgcagcgg 
ttattatttc 
aagacgaact 
ttttcaaagt 
ccgcggtgat 
gagatcatcc 
acatgagcaa 
ggctgcctgt 
ctggtggcag 
tgcggacgtt 
tactggattt 
atacatacta 
ttcccttatc 
agatccggtc 
ccactatcgg 
atttgtgtac 
gcccaagctg 



ctgatggcgg 
aagcccggcc 
gatggcggaa 
gttgccatgc 
gaagccttga 
gagatcgagc
ctgacggttc 
ctggcacgcc 
cgcagtggca 
tcaaatgacc 
gtcatgcgct 
cagatgctag 
gatagcacgt 
aacccaaagc 
aaaggcgatt 
gcctgtgcat 
cggtcgctgc 
tcaaaaatgg 
ccactcgacc 
tgaaaacctc 
cgggagcaga 
catgacccag 
cagattgtac 
aaataccgca 
cggctgcggc 
ggggataacg 
aaggccgcgt 
cgacgctcaa 
cctggaagct 
gcctttctcc 
tcggtgtagg
cgctgcgcct 
ccactggcag 
gagttcttga 
gctctgctga 
accaccgctg 
ggatctcaag 
tcacgttaag 
taatatttta 
tactgttctt 
ttgtccgccc 
aagatgttgc 
gtctttaaaa
tcgcaatcca 
tctaagctat 
tccgcataca 
aggacgccat 
aggacctttg 
acatcatagg 
tctcccacca 
tatttttcga 
cttcctcttt 
ccaattcact 
tgttttcaaa 
cacaggcagc 
gtgtttcaaa 
agtctgccgc 
atcgagtggt 
gatatattgt 
tttaatgtac 
tggttttagg 
agggtttctt 
tgggaactac 
ggcatctact 
cgagtacttc 
gcccgacagt 
catcatcgaa 



tttcccatct 
gcgtgttccg 
agcagaaaga 
agcgtacgaa 
ttagccgcta 
tagctgattg 
accccgatta 
gcgccgcagg 
gcgccggaga 
tgccggagta 
accgcaacct 
ggcaaattgc 
acattgggaa 
cgtacattgg
tttccgccta 
aactgtctgg 
gctccctacg 
ctggcctacg 
gccggcgccc 
tgacacatgc 
caagcccgtc 
tcacgtagcg 
tgagagtgca 
tcaggcgctc 
gagcggtatc 
caggaaagaa 
tgctggcgtt
gtcagaggtg 
ccctcgtgcg 
cttcgggaag 
tcgttcgctc 
tatccggtaa 
cagccactgg 
agtggtggcc 
agccagttac 
gtagcggtgg 
aagatccttt 
ggattttggt 
ttttctccca 
ccccgatatc 
tgccgcttct 
tgtctcccag 
aatcatacag 
catcggccag
tcgtataggg 
gctcgataat 
cggcctcact 
gaacaggcag 
tggtcccttt 
gcttatatac 
tcagtttttt 
tctacagtat 
gttccttgca 
gttggcgtat 
aacgctctgt 
cccggcagct 
cttacaacgg 
gattttgtgc 
ggtgtaaaca 
tgaattaacg 
aattagaaat 
atatgctcaa 
tcacacatta 
ctatttcttt 
tacacagcca 
cccggctccg 
attgccgtca 



aaccgaatcc 
tccacacgtt 
cgacctggta 
gaaggccaag 
caagatcgta 
gatgtaccgc 
ctttttgatc 
caaggcagaa 
gttcaagaag 
cgatttgaag 
gatcgagggc 
cctagcaggg 
cccaaagccg 
gaaccggtca 
aaactcttta 
ccagcgcaca 
ccccgccgct 
gccaggcaat 
acatcaaggc 
agctcccgga 
agggcgcgtc 
atagcggagt 
ccatatgcgg 
ttccgcttcc 
agctcactca 
catgtgagca 
tttccatagg 
gcgaaacccg 
ctctcctgtt 
cgtggcgctt 
caagctgggc 
ctatcgtctt 
taacaggatt 
taactacggc 
cttcggaaaa 
tttttttgtt 
gatcttttct 
catgcattct 
atcaggcttg 
ctccctgatc 
cccaagatca 
gtcgccgtgg 
ctcgcgcgga 
atcgttattc 
acaatccgat 
cttttcaggg 
catgagcaga 
ctttccttcc 
ataccggctg 
cttagcagga 
caattccggt 
ttaaagatac 
ttctaaaacc 
aacatagtat 
catcgttaca 
tagttgccgt 
ctctcccgct 
cgagctgccg 
aattgacgct 
ccgaattaat 
tttattgata 
cacatgagcg 
ttatggagaa 
gccctcggac 
tcggtccaga 
gatcggacga 
accaagctct 



atgaaccgat 
gcggacgtac 
gaaacctgca 
aacggccgcc 
aagagcgaaa 
gagatcacag 
gatcccggca 
gccagatggt 
ttctgtttca 
gaggaggcgg 
gaagcatccg 
gaaaaaggtc 
tacattggga 
cacatgtaag 
aaacttatta 
gccgaagagc 
tcgcgtcggc 
ctaccagggc 
accctgcctc 
gacggtcaca 
agcgggtgtt 
gtatactggc 
tgtgaaatac 
tcgctcactg 
aaggcggtaa 
aaaggccagc 
ctccgccccc 
acaggactat 
ccgaccctgc 
tctcatagct 
tgtgtgcacg 
gagtccaacc 
agcagagcga 
tacactagaa 
agagttggta
tgcaagcagc 
acggggtctg 
aggtactaaa 
atccccagta 
gaccggacgc 
ataaagccac 
gaaaagacaa 
tctttaaatg 
agtaagtaat 
atgtcgatgg 
ctttgttcat 
ttgctccagc 
agccatagca 
tccgtcattt 
gacattcctt 
gatattctca 
cccaagaagc 
ttaaatacca 
cgacggagcc 
atcaacatgc
tcttccgaat 
gacgccgtcc 
gtcggggagc 
tagacaactt 
tcgggggatc 
gaagtatttt 
aaaccctata 
actcgagctt 
gagtgctggg 
cggccgcgct 
ttgcgtcgca 
gatagagttg 



3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620
7680 
7740 
7800 
7860 
gtcaagacca
cctccgctcg 
gatgttggcg 
tgttatgcgg 
ccggacttcg
cgcactgacg 
gcatatgaaa 
cccgctcgtc 
tagaacagcg 
ggagatgcaa 
gagcgcggcc 
gctatttacc 
ttcgccctcc 
ctcgacagac 
gaaagctcga 
aatgaaatga 
atcccttacg 
gtcttctttt 
agaggcatct 
ttccttttct 
gtttcccgat 
atctttgata 
cacttgcttt
gggtccatct 
gcaatgatgg 
gatagctggg 
aatagccctt
gtgctccacc 
tggccgattc
cgcaacgcaa 
cttccggctc 
tatgaccatg 
aacgacaatc 
cgcgggacaa 
agccgcgggt
ttcaaaagtc 
tgacgttcca 
ataatctgca



atgcggagca
aagtagcgcg 
acctcgtatt
ccattgtccg 
gggcagtcct 
gtgtcgtcca 
tcacgccatg 
tggctaagat 
ggcagttcgg 
taggtcaggc 
gatgcaaagt
cgcaggacat 
gagagctgca
gtcgcggtga 
gagagataga
acttccttat
tcagtggaga 
tccacgatgc 
tgaacgatag
actgtccttt 
attacccttt 
ttcttggagt 
gaagacgtgg 
ttgggaccac 
catttgtagg 
caatggaatc 
tggtcttctg 
atgttggcaa 
attaatgcag
ttaatgtgag 
gtatgttgtg 
attacgaatt 
tgatcatgag 
gccgttttac
ttctggagtt 
gectaaggtc 
taaattcccc 
ccggatctcg 



tatacgcccg 
tctgctgctc 
gggaatcccc 
tcaggacatt 
cggcccaaag 
tcacagtttg 
tagtgtattg 
cggccgcagc 
tttcaggcag 
tctcgctaaa
gccgataaac
atccacgccc 
tcaggtcgga
gttcaggctt 
tttgtagaga 
atagaggaag
tatcacatca 
tcctcgtggg 
cctttccttt 
tgatgaagtg 
gttgaaaagt 
agacgagagt 
ttggaacgtc 
tgtcggcaga
tgccaccttc 
cgaggaggtt 
agactgtatc 
gctgctctag
ctggcacgac 
ttagctcact 
tggaattgtg 
cgagctcggt
cggagaatta
gtttggaact 
taatgagcta 
actatcagct 
tcggtatcca
agaatcgaat 




gagtcgtggc 
catacaagcc 
gaacatcgcc 
gttggagccg 
catcagctca 
ccagtgatac 
accgattcct 
gatcgcatcc 
gtcttgcaac
ctccccaatg 
ataacgatct 
tcctacatcg 
gacgctgtcg
tttcatatct 
gagactggtg 
gtcttgcgaa
atccacttgc 
tgggggtcca 
atcgcaatga
acagatagct 
ctcaatagcc 
gtcgtgctcc 
ttctttttcc 
ggcatcttga 
cttttctact 
tcccgatatt 
tttgatattc 
ccaatacgca 
aggtttcccg 
cattaggcac
agcggataac
acccggggat
agggagtcac
gacagaaccg 
agcacatacg 
agcaaatatt 
attagagtct 
tcccgcggcc 



gatcctgcaa
aaccacggcc 
tcgctccagt 
aaatccgcgt 
tcgagagcct
acatggggat 
tgcggtccga
atagcctccg 
gtgacaccct 
tcaagcactt 
ttgtagaaac 
aagctgaaag 
aacttttcga 
cattgccccc 
atttcagcgt
ggatagtggg 
tttgaagacg 
tctttgggac 
tggcatttgt 
gggcaatgga 
ctttggtctt 
accatgttat 
acgatgctcc 
acgatagcct
gtccttttga 
accctttgtt 
ttggagtaga 
aaccgcctct 
actggaaagc 
cccaggcttt
aatttcacac 
cctctagact 
gttatgaccc 
caacgttgaa 
tcagaaacca 
tcttgtcaaa 
catattcact 

gc 



gctccggatg
tccagaagaa 
caatgaccgc 
gcacgaggtg 
gcgcgacgga 
cagcaatcgc 
atgggccgaa 
cgaccggttg 
gtgcacggcg 
ccggaatcgg
catcggcgca 
cacgagattc 
tcagaaactt 
ccggatctgc 
gtcctctcca 
attgtgcgtc 
tggttggaac 
cactgtcggc 
aggtgccacc 
atccgaggag 
ctgagactgt 
cacatcaatc 
tcgtgggtgg 
ttcctttatc 
tgaagtgaca 
gaaaagtctc 
cgagagtgtc 
ccccgcgcgt 
gggcagtgag 
acactttatg 
aggaaacagc 
gaaggcggga 
ccgccgatga 
ggagccactc
ttattgcgcg 
aatgctccac 
ctcaatccaa 



<210> 98 
<211> 621 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> N. tabacum rDNA intergenic spacer (IGS) sequence
<300>

<308> Genbank #Y08422 
<309> 1997-10-31 



7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10122 



<400> 98 

gtgctagcca

gctggcggtg 

tgcagcggtg 

gttattggtg 

ttacatattt 

tgttttataa 

ttctccattg 

attttttcgt 

tttacaatgt 

tttggtgttg 

gggttttttt 



atgtttaaca 
gtggaaaatt 
tttgatatcg 
gttggtcatc 
tttattaaat 
aatattttat 
ttttttctat 
tttataataa 
ttaaaagtca 
tacatgtcta 
ttttaagaca 



agatgtcaag 
gcggtggttc 
gaatcactta 
tatatatttt 
ttatgcattg 
tattttatgt 
atttataata 
atatttatta 
tttgtgaata 
ttatgattct 
t 



cacaatgaat 
gagcggtagt
tggtggttgt 
tataataata 
tttgtatttt 
gttatattat 
attttcttat 
aaaaaaatat 
tattagctaa 
ctggccaaaa 



gttggtggtt 
gatcggcgat
cacaatggag 
ttaagtattt 
taaatagttt 
tacttgatgt 
ttttttttgt 
tatttttgta 
gttgtacttc 
catgtctact 



ggtggtcgtg 
ggttggtgtt 
gtgcgtcatg 
tacctatttt 
ttatcgtact
attggaaatt 
tttattatgt 
aaatatatca 
tttttgtgca 
cctgtcactt 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600

621 



<210> 99 
<211> 25 
<212> DNA 



WO 02/097059 PCT/US02/17452


<213> Artificial Sequence 
<220> 

<223> NTIGS-F1 Primer 
<400> 99 

gtgctagcca atgtttaaca agatg 25 

<210> 100 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> NTIGS-R1 Primer 
<400> 100 

atgtcttaaa aaaaaaaacc caagtgac 28 

<210> 101 

<211> 233 

<212> DNA 

<213> Mus Musculus 

<300> 

<308> Genbank #V00846 
<309> 1989-07-06 

<400> 101 

gacctggaat atggcgagaa aactgaaaat cacggaaaat gagaaataca cactttagga 60 

cgtgaaatat ggcgaggaaa actgaaaaag gtggaaaatt tagaaatgtc cactgtagga 120 

cgtggaatat ggcaagaaaa ctgaaaatca tggaaaatga gaaacatcca cttgacgact 180 

tgaaaaatga cgaaatcact aaaaaacgtg aaaaatgaga aatgcacact gaa 233 

<210> 102 

<211> 31 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> MSAT-F1 Primer 

<400> 102 

aataccgcgg aagcttgacc tggaatatcg c 31 

<210> 103 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> MSAT-R1 Primer 

<400> 103 

ataaccgcgg agtccttcag tgtgcat 27 

<210> 104 
<211> 277 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Nopaline Synthase Promoter Sequence 



<300> 

<308> Genbank #U09365 
<309> 1997-10-17 




<400> 104 

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60 
tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120 
aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180 
attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240 
gcgcgcggtg tcatctatgt tactagatcg ggaattc 277 

<210> 105 
<211> 1812 
<212> DNA 

<213> Escherichia coli 

<220> 

<221> CDS 

<222> (1)...(1812) 

<223> Beta-Glucuronidase 

<300> 

<308> Genbank #S69414 
<309> 1994-09-23 

<400> 105 

atg tta cgt cct gta gaa acc cca acc cgt gaa atc aaa aaa ctc gac 48 
Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp 
1               5                   10                  15 

ggc ctg tgg gca ttc agt ctg gat cgc gaa aac tgt gga att gat cag 96 
Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln 
            20                  25                  30 

cgt tgg tgg gaa agc gcg tta caa gaa agc cgg gca att gct gtg cca 144 
Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro 
        35                  40                  45 

ggc agt ttt aac gat cag ttc gcc gat gca gat att cgt aat tat gcg 192 
Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala 
    50                  55                  60 

ggc aac gtc tgg tat cag cgc gaa gtc ttt ata ccg aaa ggt tgg gca 240 
Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala 
65                  70                  75                  80 

ggc cag cgt atc gtg ctg cgt ttc gat gcg gtc act cat tac ggc aaa 288 
Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys 
                85                  90                  95 

gtg tgg gtc aat aat cag gaa gtg atg gag cat cag ggc ggc tat acg 336 
Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr 
            100                 105                 110 

cca ttt gaa gcc gat gtc acg ccg tat gtt att gcc ggg aaa agt gta 384 
Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val 
        115                 120                 125 

cgt atc acc gtt tgt gtg aac aac gaa ctg aac tgg cag act atc ccg 432 
Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro 
    130                 135                 140 

ccg gga atg gtg att acc gac gaa aac ggc aag aaa aag cag tct tac 480 
Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr 
145                 150                 155                 160 

ttc cat gat ttc ttt aac tat gcc gga atc cat cgc agc gta atg ctc 528 
Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu 
                165                 170                 175 

tac acc acg ccg aac acc tgg gtg gac gat atc acc gtg gtg acg cat 576 
Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His 
            180                 185                 190 

gtc gcg caa gac tgt aac cac gcg tct gtt gac tgg cag gtg gtg gcc 624 
Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala 
        195                 200                 205 

aat ggt gat gtc agc gtt gaa ctg cgt gat gcg gat caa cag gtg gtt 672 
Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val 
    210                 215                 220 

gca act gga caa ggc act agc ggg act ttg caa gtg gtg aat ccg cac 720 
Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His 
225                 230                 235                 240 

ctc tgg caa ccg ggt gaa ggt tat ctc tat gaa ctg tgc gtc aca gcc 768 
Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala 
                245                 250                 255 

aaa agc cag aca gag tgt gat atc tac ccg ctt cgc gtc ggc atc cgg 816 
Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg 
            260                 265                 270 

tca gtg gca gtg aag ggc gaa cag ttc ctg att aac cac aaa ccg ttc 864 
Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe 
        275                 280                 285 

tac ttt act ggc ttt ggt cgt cat gaa gat gcg gac ttg cgt ggc aaa 912 
Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys 
    290                 295                 300 

gga ttc gat aac gtg ctg atg gtg cac gac cac gca tta atg gac tgg 960 
Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp 
305                 310                 315                 320 

att ggg gcc aac tcc tac cgt acc tcg cat tac cct tac gct gaa gag 1008 
Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu 
                325                 330                 335 

atg ctc gac tgg gca gat gaa cat ggc atc gtg gtg att gat gaa act 1056 
Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr 
            340                 345                 350 

gct gct gtc ggc ttt aac ctc tct tta ggc att ggt ttc gaa gcg ggc 1104 
Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly 
        355                 360                 365 

aac aag ccg aaa gaa ctg tac agc gaa gag gca gtc aac ggg gaa act 1152 
Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr 
    370                 375                 380 

cag caa gcg cac tta cag gcg att aaa gag ctg ata gcg cgt gac aaa 1200 
Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys 
385                 390                 395                 400 

aac cac cca agc gtg gtg atg tgg agt att gcc aac gaa ccg gat acc 1248 
Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr 
                405                 410                 415 

cgt ccg caa ggt gca cgg gaa tat ttc gcg cca ctg gcg gaa gca acg 1296 
Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr 
            420                 425                 430 

cgt aaa ctc gac ccg acg cgt ccg atc acc tgc gtc aat gta atg ttc 1344 
Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe 
        435                 440                 445 

tgc gac gct cac acc gat acc atc agc gat ctc ttt gat gtg ctg tgc 1392 
Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys 
    450                 455                 460 

ctg aac cgt tat tac gga tgg tat gtc caa agc ggc gat ttg gaa acg 1440 
Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr 
465                 470                 475                 480 

gca gag aag gta ctg gaa aaa gaa ctt ctg gcc tgg cag gag aaa ctg 1488 
Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu 
                485                 490                 495 

cat cag ccg att atc atc acc gaa tac ggc gtg gat acg tta gcc ggg 1536 
His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly 
            500                 505                 510 

ctg cac tca atg tac acc gac atg tgg agt gaa gag tat cag tgt gca 1584 
Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala 
        515                 520                 525 

tgg ctg gat atg tat cac cgc gtc ttt gat cgc gtc agc gcc gtc gtc 1632 
Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val 
    530                 535                 540 

ggt gaa cag gta tgg aat ttc gcc gat ttt gcg acc tcg caa ggc ata 1680 
Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile 
545                 550                 555                 560 

ttg cgc gtt ggc ggt aac aag aaa ggg atc ttc act cgc gac cgc aaa 1728 
Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys 
                565                 570                 575 

ccg aag tcg gcg gct ttt ctg ctg caa aaa cgc tgg act ggc atg aac 1776 
Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn 
            580                 585                 590 

ttc ggt gaa aaa ccg cag cag gga ggc aaa caa tga 1812 
Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln * 
        595                 600 



<210> 106 
<211> 603 
<212> PRT 

<213> Escherichia coli 

<300> 

<308> Genbank #S69414 
<309> 1994-09-23 

<400> 106 

Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp 
1               5                   10                  15 

Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln 
            20                  25                  30 

Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro 
        35                  40                  45 

Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala 
    50                  55                  60 

Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala 
65                  70                  75                  80 

Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys 
                85                  90                  95 

Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr 
            100                 105                 110 

Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val 
        115                 120                 125 

Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro 
    130                 135                 140 

Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr 
145                 150                 155                 160 

Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu 
                165                 170                 175 

Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His 
            180                 185                 190 

Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala 
        195                 200                 205 

Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val 
    210                 215                 220 

Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His 
225                 230                 235                 240 

Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala 
                245                 250                 255 

Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg 
            260                 265                 270 

Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe 
        275                 280                 285 

Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys 
    290                 295                 300 

Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp 
305                 310                 315                 320 

Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu 
                325                 330                 335 

Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr 
            340                 345                 350 

Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly 
        355                 360                 365 

Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr 
    370                 375                 380 

Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys 
385                 390                 395                 400 

Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr 
                405                 410                 415 

Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr 
            420                 425                 430 

Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe 
        435                 440                 445 

Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys 
    450                 455                 460 

Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr 
465                 470                 475                 480 

Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu 
                485                 490                 495 

His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly 
            500                 505                 510 

Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala 
        515                 520                 525 

Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val 
    530                 535                 540 

Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile 
545                 550                 555                 560 

Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys 
                565                 570                 575 

Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn 
            580                 585                 590 

Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln 
        595                 600 













<210> 107 
<211> 277 
<212> DNA 

<213> Artificial Sequence 
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<220> 

<223> Nopaline Synthase Terminator Sequence 
<300> 

<308> U09365 
<309> 1995-10-17 



<400> 107 

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60 
tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120 
aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180 
attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240 
gcgcgcggtg tcatctatgt tactagatcg ggaattc 277 

<210> 108 
<211> 3451 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> HindIII Fragment containing the beta-glucuronidase 
coding sequence, the rDNA intergenic spacer, and 
the Mastl sequence 



<400> 108 

aagcttgacc tggaatatcg cgagtaaact gaaaatcacg gaaaatgaga aatacacact 60 
ttaggacgtg aaatatggcg aggaaaactg aaaaaggtgg aaaatttaga aatgtccact 120 
gtaggacgtg gaatatggca agaaaactga aaatcatgga aaatgagaaa catccacttg 180 
acgacttgaa aaatgacgaa atcactaaaa aacgtgaaaa atgagaaatg cacactgaag 240 
gactccgcgg gaattcgatt gtgctagcca atgtttaaca agatgtcaag cacaatgaat 300 
gttggtggtt ggtggtcgtg gctggcggtg gtggaaaatt gcggtggttc gagcggtagt 360 
gatcggcgat ggttggtgtt tgcagcggtg tttgatatcg gaatcactta tggtggttgt 420 
cacaatggag gtgcgtcatg gttattggtg gttggtcatc tatatatttt tataataata 480 
ttaagtattt tacctatttt ttacatattt tttattaaat ttatgcattg tttgtatttt 540 
taaatagttt ttatcgtact tgttttataa aatattttat tattttatgt gttatattat 600 
tacttgatgt attggaaatt ttctccattg ttttttctat atttataata attttcttat 660 
ttttttttgt tttattatgt attttttcgt tttataataa atatttatta aaaaaaatat 720 
tatttttgta aaatatatca tttacaatgt ttaaaagtca tttgtgaata tattagctaa 780 
gttgtacttc tttttgtgca tttggtgttg tacatgtcta ttatgattct ctggccaaaa 840 
catgtctact cctgtcactt gggttttttt ttttaagaca taatcactag tgattatatc 900 
tagactgaag gcgggaaacg acaatctgat catgagcgga gaattaaggg agtcacgtta 960 
tgacccccgc cgatgacgcg ggacaagccg ttttacgttt ggaactgaca gaaccgcaac 1020 
gttgaaggag ccactcagcc gcgggtttct ggagtttaat gagctaagca catacgtcag 1080 
aaaccattat tgcgcgttca aaagtcgcct aaggtcacta tcagctagca aatatttctt 1140 
gtcaaaaatg ctccactgac gttccataaa ttcccctcgg tatccaatta gagtctcata 1200 
ttcactctca atccaaataa tctgcaccgg atctcgagat cgaattcccg cggccgcgaa 1260 
ttcactagtg gatccccggg tacggtcagt cccttatgtt acgtcctgta gaaaccccaa 1320 
cccgtgaaat caaaaaactc gacggcctgt gggcattcag tctggatcgc gaaaactgtg 1380 
gaattgagca gcgttggtgg gaaagcgcgt tacaagaaag ccgggcaatt gctgtgccag 1440 
gcagttttaa cgatcagttc gccgatgcag atattcgtaa ttatgtgggc aacgtctggt 1500 
atcagcgcga agtctttata ccgaaaggtt gggcaggcca gcgtatcgtg ctgcgtttcg 1560 
atgcggtcac tcattacggc aaagtgtggg tcaataatca ggaagtgatg gagcatcagg 1620 
gcggctatac gccatttgaa gccgatgtca cgccgtatgt tattgccggg aaaagtgtac 1680 
gtatcacagt ttgtgtgaac aacgaactga actggcagac tatcccgccg ggaatggtga 1740 
ttaccgacga aaacggcaag aaaaagcagt cttacttcca tgatttcttt aactacgccg 1800 
ggatccatcg cagcgtaatg ctctacacca cgccgaacac ctgggtggac gatatcaccg 1860 
tggtgacgca tgtcgcgcaa gactgtaacc acgcgtctgt tgactggcag gtggtggcca 1920 
atggtgatgt cagcgttgaa ctgcgtgatg cggatcaaca ggtggttgca actggacaag 1980 
gcaccagcgg gactttgcaa gtggtgaatc cgcacctctg gcaaccgggt gaaggttatc 2040 
tctatgaact gtacgtcaca gccaaaagcc agacagagtg tgatatctac ccgctgcgcg 2100 
tcggcatccg gtcagtggca gtgaagggcg aacagttcct gatcaaccac aaaccgttct 2160 
actttactgg ctttggccgt catgaagatg cggatttgcg cggcaaagga ttcgataacg 2220 
tgctgatggt gcacgatcac gcattaatgg actggattgg ggccaactcc taccgtacct 2280 
cgcattaccc ttacgctgaa gagatgctcg actgggcaga tgaacatggc atcgtggtga 2340 
ttgatgaaac tgcagctgtc ggctttaacc tctctttagg cattggtttc gaagcgggca 2400 
acaagccgaa agaactgtac agcgaagagg cagtcaacgg ggaaactcag caggcgcact 2460 
tacaggcgat taaagagctg atagcgcgtg acaaaaacca cccaagcgtg gtgatgtgga 2520 
gtattgccaa cgaaccggat acccgtccgc aaggtgcacg ggaatatttc gcgccactgg 2580 
cggaagcaac gcgtaaactc gatccgacgc gtccgatcac ctgcgtcaat gtaatgttct 2640 
gcgacgctca caccgatacc atcagcgatc tctttgatgt gctgtgcctg aaccgttatt 2700 
acggttggta tgtccaaagc ggcgatttgg aaacggcaga gaaggtactg gaaaaagaac 2760 
ttctggcctg gcaggagaaa ctgcatcagc cgattatcat caccgaatac ggcgtggata 2820 
cgttagccgg gctgcactca atgtacaccg acatgtggag tgaagagtat cagtgtgcat 2880 
ggctggatat gtatcaccgc gtctttgatc gcgtcagcgc cgtcgtcggt gaacaggtat 2940 
ggaatttcgc cgattttgcg acctcgcaag gcatattgcg cgttggcggt aacaagaagg 3000 
ggatcttcac ccgcgaccgc aaaccgaagt cggcggcttt tctgctgcaa aaacgctgga 3060 
ctggcatgaa cttcggtgaa aaaccgcagc agggaggcaa acaatgaatc aacaactctc 3120 
ctggcgcacc atcgtcggct acagcctcgg gaattgcgta ccgagctcga atttccccga 3180 
tcgttcaaac atttggcaat aaagtttctt aagattgaat cctgttgccg gtcttgcgat 3240 
gattatcata taatttctgt tgaattacgt taagcatgta ataattaaca tgtaatgcat 3300 
gacgttattt atgagatggg tttttatgat tagagtcccg caattataca tttaatacgc 3360 
gatagaaaac aaaatatagc gcgcaaacta ggataaatta tcgcgcgcgg tgtcatctat 3420 
gttactagat cgggaattcg atatcaagct t 3451 



<210> 109 

<211> 14627 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAglla Plasmid 



<400> 109 

catgccaacc 

atagtgcagt 

agtcctaagt 

gttttagtcg 

agagcgccgc 

ccaaccaacg 

ccggcaccag 

acgttgtgac 

ttgccgagcg 

acaccaccac 

agcgttccct 

tgaagtttgg 

tcgaccagga 

ccctgtaccg 

gtgccttccg 

gccaagagga 

cgaagagat c 

ctcaaccgtg 

gccggccagc 

tgagtaaaac 

aatacgcaag 

aagacgacca 

ttagtcgatt 

ccgctaaccg 

cggcgcgact 

atcaaggcag 

accgccgacc 

gcggcctttg 

gcgctggccg 

ccaggcactg 

cgcgaggtcc 

aagagaaaat 

gcaaggctgc 

agttgccggc 

ttaccgagct 

atgagtagat 

accgacgccg 

fcgggttgfcct 

cggtcgcaaa 

gaagttgaag 

tgaatcgtgg 



acagggttcc 
cggcttctga 
tacgcgacag 
cataaagtag 
cgctggcctg 
ggccgaactg 
gcgcgaccgc 
agtgaccagg 
catccaggag 
gccggccggc 
aatcafccgac 
cccccgccct 
aggccgcacc 
cgcacttgag 
tgaggacgca 
acaagcatga 
gaggcggaga 
cggctgcatg 
ttggccgctg 
agcttgcgtc 
gggaacgcat 
tcgcaaccca 
ccgatcccca 
ttgtcggcat 
tcgtagtgat 
ccgacttcgt 
tggtggagct 
tcgtgtcgcg 
ggtacgagct 
c cgc cgc egg 
aggegctgge 
gagcaaaagc 
aacgttggcc 
ggaggatcac 
gctatctgaa 
gaattttagc 
tggaatgece 
gccggccctg 
ccatccggcc 
gccgcgcagg 
caagcggccg 



cc tegggate 
cgttcagtgc 
gctgccgccc 
aatacttgcg 
ctgggctatg 
cacgcggccg 
ccggagctgg 
ctagaccgcc 
gccggcgcgg 
cgcatggtgt 
cgcacccgga 
accctcaccc 
gtgaaagagg 
egcagegagg 
ttgaccgagg 
aaccgcacca 
tgatcgegge 
aaatcctggc 
aagaaaccga 
atgeggtege 
gaaggttatc 
tctagcccgc 
gggcagtgcc 
cgaccgcccg 
egaeggageg 
gctgattccg 
ggttaagcag 
ggcgatcaaa 
gcccattctt 
cacaaccgtt 
cgctgaaatt 
acaaacacgc 
agcctggcag 
accaagctga 
tacatcgcgc 
ggcfcaaagga 
catgtgtgga 
caatggcact 
eggtacaaat 
ccgcccagcg 
ctgatcgaat 



aaagtacttt 
agccgtcttc 
tgcccttttc 
actagaaccg 
cccgcgtcag 
gctgcaccaa 
ecaggatget 
tggcccgcag 
gectgegtag 
tgaccgtgtt 
gegggegega 
eggcacagat 
cggctgcact 
aagtgacgcc 
ccgacgccct 
ggacggccag 
egggtaegtg 
cggttfcgtct 
gcgccgccgt 
tgcgtatatg 
gctgtactta 
gccctgcaac 
cgcgattggg 
acgattgacc 
ccccaggcgg 
gtgcagccaa 
cgcattgagg 
ggcacgcgca 
gagtccegta 
cfcfcgaafccag 
aaatcaaaac 
t aagtgccgg 
acacgccagc 
agatgtaege 
agctaccaga 
ggcggcatgg 
ggaacgggcg 
ggaaccccca 
cggcgcggcg 
gcaacgcatc 
ccgcaaagaa 



gatccaaccc 
tgaaaacgac 
ctggcgtttt 
gagacattac 
caccgacgac 
gctgttttcc 
tgaccaccta 
cacccgcgac 
cctggcagag 
cgccggcatt 
ggccgccaag 
cgcgcacgcc 
gcttggcgtg 
caccgaggcc 
ggcggccgc c 
gaegaacegt 
ttcgagccgc 
gatgecaage 
ctaaaaaggt 
atgegatgag 
accagaaagg 
tcgccggggc 
cggccgtgcg 
gcgacgtgaa 

eggacttgge 

gcccttacga 
teaeggatgg 
tcggcggtga 
tcacgcagcg 
aacccgaggg 
tcatttgagt 
ccgtccgagc 
catgaagegg 
ggtacgccaa 
gtaaatgagc 
aaaatcaaga 
gttggccagg 
ageccgagga 
ctgggtgatg 
gaggcagaag 
tcccggcaac 



ctccgctgct 
atgtcgcaca 
cttgtcgcgt 
gecatgaaca 
caggacttga 
gagaagatca 
cgccctggcg 
ctactggaca 
ccgtgggccg 
gecgagtteg 
gcccgaggcg 
cgcgagctga 
catcgctcga 
aggeggegeg 
gagaatgaac 
ttttcattac 
ccgcgcacgt 
tggcggcctg 
gatgtgtatt 
taaataaaca 
egggtcagge 
cgatgttctg 
ggaagatcaa 
ggccatcggc 
tgtgtccgcg 
catatgggee 
aaggctacaa 
ggttgccgag 
cgtgagctac 
cgacgctgcc 
taatgaggta 
gcacgcagca 
gtcaactttc 
ggcaagacca 
aaatgaafcaa 
acaaccaggc 
egtaagegge 
atcggcgtga 
acctggtgga 
cacgccccgg 
cgccggcagc 
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cggtgcgccg 
gatgcfccfcat 
tctgtcgaag 
cgtagaggtt 
gatggcggtt 
gcccggccgc 
tggcggaaag 
tgccatgcag 
agccttgatt 
gatcgagcta 
gacggttcac 
ggcacgccgc 
cagtggcagc 
aaatgacctg 
catgcgctac 
gatgctaggg 
tagcacgtac 
cccaaagccg 
aggcgafcttt 
ctgtgcataa 
gtcgctgcgc 
aaaaatggct 
acfccgaccgc 
aaaaccfccfcg 
ggagcagaca 
tgacccagtc 
gattgtactg 
ataccgcatc 
gctgcggcga 
ggataacgca 
ggccgcgttg 
acgctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
actggcagca 
gttcttgaag 
tctgctgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
atattttatt 
ctgttcttcc 
gtccgccctg 
gatgttgctg 
ctttaaaaaa 
gcaatccaca 
taagctattc 
cgcatacagc 
gacgccatcg 
gaccfctfcgga 
atcataggtg 
tcccaccagc 
tttttcgatc 
tcctcttttc 
aattcactgt 
ttttcaaagt 
caggcagcaa 
gtttcaaacc 
tctgccgcct 
cgagtggtga 
tatattgtgg 
taatgtactg 
gttttaggaa 
ggtttcttat 
ggaactactc 



tcgattagga 
gacgtgggca 
cgtgaccgac 
tccgcagggc 
tcccatctaa 
gtgttccgtc 
cagaaagacg 
cgtacgaaga 
agccgctaca 
gctgafctgga 
cccgattact 
gccgcaggca 
gccggagagt 
ccggagtacg 
cgcaacctga 
caaattgccc 
attgggaacc 
tacatfcggga 
tccgccfcaaa 
ctgtctggcc 
tccctacgcc 
ggcctacggc 
cggcgcccac 
acacatgcag 
agcccgtcag 
acgtagcgat 
agagtgcacc 
aggcgctcfct 
gcggtatcag 
ggaaagaaca 
ctggcgtttt 
cagaggfcggc 
ctcgtgcgct 
tcgggaagcg 
gttcgctcca 
tccggtaact 
gccactggta 
tggtggccta 
ccagttacct 
agcggtggtt 
gatcctttga 
attttggtca 
ttctcccaat 
ccgatatcct 
ccgcttctcc 
tctcccaggt 
tcatacagct 
tcggccagat 
gtatagggac 
tcgataatct 
gcctcactca 
acaggcagct 
gtccctttat 
ttatatacct 
agttttttca 
tacagtattt 
tccttgcatt 
tggcgtataa 
cgctctgtca 
cggcagctta 
tacaacggct 
ttttgtgccg 
tgtaaacaaa 
aattaacgcc 
ttagaaatfct 
atgctcaaca 
acacattatt 



agccgcccaa 
cccgcgatag 
gagctggcga 
cggccggcat 
ccgaatccat 
cacacgttgc 
acctggtaga 
aggccaagaa 
agatcgtaaa 
tgtaccgcga 
ttttgatcga 
aggcagaagc 
tcaagaagtt 
atfctgaagga 
t cgagggcga 
tagcagggga 
caaagccgta 
accggtcaca 
actctttaaa 
agcgcacagc 
ccgccgcttc 
caggcaatct 
atcaaggcac 
ctcccggaga 
ggcgcgtcag 
agcggagtgt 
atatgcggtg 
ccgcttcctc 
ctcactcaaa 
tgtgagcaaa 
tccataggct 
gaaacccgac 
ctcctgttcc 

tggcgctttc 

agctgggctg 
atcgtcttga 
acaggattag 
actacggcta 
tcggaaaaag 
tttttgtttg 
tcttttctac 
tgcattctag 
caggcttgat 
ccctgatcga 
caagatcaat 
cgccgtggga 
cgcgcggatc 
cgttattcag 
aatccgatat 
tttcagggcfc 
tgagcagatt 
ttccttccag 
accggctgtc 
tagcaggaga 
attccggtga 
aaagataccc 
ctaaaacctt 
catagtatcg 
tcgttacaat 
gfctgccgttc 
ctcccgctga 
agctgccggt 
ttgacgctta 
gaattaattc 
tattgataga 
catgagcgaa 
atggagaaac 



gggcgacgag 
tcgcagcatc 
ggtgatccgc 
ggccagtgtg 
gaaccgatac 
ggacgtactc 
aacctgcatt 
cggccgcctg 
gagcgaaacc 
gatcacagaa 
tcccggcatc 
cagatggttg 
ctgtttcacc 
ggaggcgggg 
agcatccgcc 
aaaaggtcga 
cattgggaac 
cafcgtaagtg 
acttattaaa 
cgaagagctg 
gcgtcggcct 
accagggcgc 
cctgcctcgc 
cggtcacagc 
cgggtgttgg 
atactggctt 
tgaaataccg 
gctcactgac 
ggcggtaata 
aggccagcaa 
ccgcccccct 
agga c t at aa 
gaccctgccg 
tcatagctca 
tgtgcacgaa 
gtccaacccg 
cagagcgagg 
cactagaagg 
agttggtagc 
caagcagcag 
ggggtctgac 
gtactaaaac 
ccccagtaag 
ccggacgcag 
aaagccactt 
aaagacaagt 
tttaaatgga 
taagtaatcc 
gtcgatggag 
ttgttcatct 
gctccagcca 
ccatagcatc 
cgtcattttt 
cattccttcc 
tattctcatt 
caagaagcta 
aaataccaga 
acggagccga 
caacatgcta 
ttccgaafcag 
cgccgtcccg 

cggggagctg 

gacaacttaa 
gggggatctg 
agtattttac 
accctatagg 
tcgagtcaaa 



caaccagatt 
atggacgtgg 
tacgagcttc 
tgggattacg 
cgggaaggga 
aagttctgcc 
cggttaaaca 
gtgacggtat 

gggcggccgg 

ggcaagaacc 
ggccgttttc 
ttcaagacga 
gtgcgcaagc 
caggctggcc 
ggttcctaat 
aaaggtctct 
cggaacccgt 
actgatataa 
actcttaaaa 
caaaaagcgc 
atcgcggccg 
ggacaagccg 
gcgtttcggt 
ttgtctgtaa 
cgggtgtcgg 
aactatgcgg 
c acagat gcg 
tcgctgcgct 
cggttatcca 
aaggccagga 
gacgagcatc 
agataccagg 
cttaccggat 
cgctgtaggt 
ccccccgttc 
gtaagacacg 
tatgtaggcg 
acagtatttg 
tcttgatccg 
attacgcgca 
gctcagtgga 
aattcatcca 
tcaaaaaata 
aaggcaatgt 
actttgccat 
tcctcttcgg 
gtgtcttctt 
aattcggcta 
tgaaagagcc 
tcatactctt 
tcatgccgtt 
atgtcctttt 
aaatataggt 
gtatctttta 
ttagccattt 
attataacaa 
aaacagcttt 
ttttgaaacc 
ccctccgcga 
catcggtaac 
gactgatggg 
ttggctggct 
taacacattg 
gattttagta 
aaatacaaat 
aaccctaatt 
tctcggtgac 



ttttcgttcc 
ccgttttccg 
cagacgggca 
acctggtact 
agggagacaa 
ggcgagccga 
ccacgcacgt 
ccgagggtga 
agtacatcga 
cggacgtgct 
tctaccgcct 
tctacgaacg 
tgatcgggtc 
cgatcctagt 
gtacggagca 
ttcctgtgga 
acattgggaa 
aagagaaaaa 
cccgcctggc 
ctacccttcg 
ctggccgctc 
cgccgtcgcc 
gatgacggtg 
gcggatgccg 
ggcgcagcca 
catcagagca 
taaggagaaa 
cggtcgttcg 
cagaatcagg 
accgtaaaaa 
acaaaaatcg 
cgtttccccc 
acctgtccgc 
atctcagttc 
agcccgaccg 
acttatcgcc 
gtgctacaga 
gtatctgcgc 
gcaaacaaac 
gaaaaaaagg 
acgaaaactc 
gtaaaatata 
gctcgacata 
cataccactt 
ctttcacaaa 
gcttttccgt 
cccagttttc 
agcggctgtc 
tgatgcactc 
ccgagcaaag 
caaagtgcag 
cccgttccac 
tttcattttc 
cgcagcggta 
attatttcct 
gacgaactcc 
ttcaaagttg 
gcggtgatca 
gatcatccgt 
atgagcaaag 
ctgcctgtat 
ggtggcagga 
cggacgtttt 
ctggattttg 
acatactaag 
cccttatctg 
gggcaggacc 
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4800 
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5700 
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5820 
5880 
5940 
6000 
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6300 
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ggacggggcg 
ccgtgcttga 
atgcgcacgc 
gcctccaggg 
cggggggaga 

gggcccgcgt 
cgctcccgca 
aagttgaccg 
gcctcggtgg 
gagatagatt 
ttccttatat 

agtggagata 

cacgatgctc 
aacgatagcc 
tgtccttttg 
taccctttgt 
cttggagtag 
agacgtggtt 
gggaccactg 
tttgtaggtg 
atggaatccg 
gtcttctgag 
gttggcaagc 
taatgcagct 
aatgtgagtt 
atgttgtgtg 
tacgaattcg 
gagtttggac 
gatgctattg 
gaactccagc 
t ccgaagc cc 
gtcctgctcc 
ccgcccccac 
cgtggacacg 
ggccagggtg 
gtcccggacc 
ggtccagaac 
caacttggcc 
gcaggaattc 
accaaagggc 
attgcccagc 
aatgccatca 
ccaaagatgg 
cttcaaagca 
agaatatcaa 
taatatcggg 
cagtagaaaa 
ttcaagatgc 
tggaaaaaga 
ctgacgtaag 
aagttcattt 
tctctcgagc 
cgacgtctgt 
tctcggaggg 
tgcgggtaaa 
catcggccgc 
cctattgcat 
tgcccgctgt 
gccagacgag 
gt gat t teat 
acaccgtcag 
gccccgaagt 
atggccgcat 
aggtcgecaa 
acttcgagcg 
gcattggtct 
gggcgcaggg 



gtaccggcag 
agccggccgc 
tcgggfccgtt 
acttcagcag 
cgtacacggt 
aggegatgee 
gaeggacgag 
tgcttgtctc 
caeggeggat 
tgtagagaga 
agaggaagg t 
tcacatcaat 
ctcgtgggtg 
tttcctttat 
atgaagtgac 
tgaaaagtct 
acgagagtgt 
ggaaegtett 
teggcagagg 
ccaccttcct 
aggaggtttc 
actgtatctt 
tgctctagcc 
ggcacgacag 
agctcactca 
gaattgtgag 
agecttgact 
aaaccacaac 
ctttatttgt 
atgagatccc 
aacctttcat 
tcggccacga 
ggctgctcgc 
acctccgacc 
ttgtccggca 
acaccggcga 
tcgaccgctc 
atggatccag 
gatcgacact 
tattgagact 
tatctgtcac 
ttgcgataaa 
acccccaccc 
agt gga t tga 
agatacagtc 
aaacctcctc 
ggaaggtggc 
ctctgccgac 
agaegttcca 
ggatgacgea 
catttggaga 
tttegcagat 
cgagaagttt 
cgaagaatct 
tagctgcgcc 
gctcccgatt 
ctcccgccgt 
tctacaaccg 
egggttegge 
atgegegatt 
tgcgtccgtc 
ccggcacctc 
aacageggtc 
catcttcttc 
gaggcatccg 
tgaccaact c 
tegatgegae 



gctgaagtcc 
ccgcagcatg 
gggcagcccg 
gtgggtgtag 
cgactcggcc 
ggcgacctcg 
gtcgtccgtc 
gatgtagtgg 
gtcggccggg 
gactggtgat 
ettgegaagg 
ccacttgctt 
ggggtccatc 
cgcaatgatg 
agatagctgg 
caatagccct 
cgtgctccac 
ctttttccac 
catcttgaac 
tttctactgt 
ccgatattac 
tgatattctt 
aatacgcaaa 
gtttcccgac 
ttaggcaccc 
eggataacaa 
agagggt cga 
tagaatgeag 
aaccattata 
cgcgctggag 
agaaggegge 
agtgcacgca 
egatcteggt 
acteggegta 
ccacctggtc 
agtcgtcctc 
cggcgacgtc 
atttegctea 
ctcgtctact 
tttcaacaaa 
ttcatcaaaa 
ggaaaggcta 
acgaggagca 
tgtgataaca 
t c agaagac c 
ggattccatt 
acctacaaat 
agtggtccca 
accacgtctt 
caatcccact 
ggacacgctg 

c c ggggggg c 

ctgatcgaaa 
cgtgctttca 
gatggtttct 
ccggaagtgc 
gcacagggtg 
gtcgeggagg 
ccattcggac 
gctgatcccc 
gcgcaggctc 
gtgeacgegg 
attgactgga 
tggaggccgt 
gagcttgeag 
tatcagagct 
gcaatcgtcc 



agetgecaga 
ccgcgggggg 
atgacagega 
agcgtggagc 
gtccagtcgt 
ccgtccacct 
cactcctgcg 
ttgacgatgg 
cgtcgttctg 
ttcagcgtgt 
atagtgggat 
tgaagacgtg 
tttgggacca 
gcatttgtag 
gcaatggaat 
ttggtcttct 
catgttatca 
gatgctcctc 
gatagecttt 
ccttttgatg 
cctttgttga 
ggagtagacg 
ccgcctctcc 
tggaaagegg 
caggctttac 
tttcacacag 
eggtatacag 
tgaaaaaaat 
agetgeaata 
gatcatccag 
ggtggaatcg 
gttgccggcc 
catggccggc 
cagctcgtcc 
ctggaccgcg 
cacgaagtcc 
gcgcgcggtg 
agttagtata 
ccaagaatat 
gggtaatatc 
ggacagtaga 
tegttcaaga 
tcgtggaaaa 

tggtggagca 

aaagggc t at 
gcccagctat 
gecatcattg 
aagatggacc 
caaagcaagt 
atccttcgca 
aaatcaccag 
aat gaga tat 
agttcgacag 
gcttcgatgt 
acaaaga t eg 
ttgacattgg 
teaegttgea 
ctatggatgc 
cgcaaggaat 
atgtgtatca 
tcgatgagct 
attteggetc 
gegaggegat 
ggttggcttg 
gatcgccacg 
tggttgacgg 
gatceggage 



aacccacgtc 
catatccgag 
ccacgctctt 
ccagt cccgt 
aggegt t gcg 
eggegacgag 
gttcctgegg 
tgcagaccgc 
ggctcatggt 
cctctccaaa 
tgtgcgtcat 
gttggaacgt 
ctgtcggcag 
gtgccacctt 
ccgaggaggt 
gagactgtat 
catcaatcca 
gtgggtgggg 
cctttatcgc 
aagtgacaga 
aaagtctcaa 
agagtgtcgt 
ccgcgcgttg 
gcagtgagcg 
actttatget 
gaaacagcta 
acatgataag 
gctttatttg 
aacaagttgg 
ccggcgtccc 
aaatctcgta 
gggtcgegea 
ccggaggcgt 
aggccgcgca 
ctgatgaaca 
egggagaace 
ageaceggaa 
aaaaagcagg 
caaagataca 
gggaaacctc 
aaaggaaggt 
tgcctctgcc 
agaagaegtt 
cgacactctc 
tgagactttt 
ctgtcacttc 
cgataaagga 
cccacccacg 
ggattgatgt 
agaccttcct 
tctctctcta 
gaaaaagect 
cgtctccgac 
aggagggegt 
ttatgtttat 
ggagtttagc 
agacctgcct 
gategctgeg 
eggtcaatae 
ctggcaaact 
gatgetttgg 
caacaatgtc 
gtteggggat 
tatggagcag 
actccgggcg 
caatttcgat 
cgggactgtc 



atgccagttc 
cgcctcgtgc 
gaagccctgt 
ccgctggtgg 
tgccttccag 
ccagggatag 
cteggtaegg 
cggcatgtcc 
agac t cgaga 
tgaaatgaac 
cccttacgtc 
cttctttttc 
aggcatcttg 
ccttttctac 
ttcccgatat 
ctttgatatt 
ettgetttga 
gtccatcttt 
aatgatggca 
tagctgggca 
tagecctttg 
gctccaccat 
gecgattcat 
caaegcaatt 
tccggctcgt 
tgaccatgat 
atacattgat 
tgaaatttgt 
ggtgggcgaa 
ggaaaacgat 
gcacgtgtca 
gggegaaetc 
cccggaagtt 
cccacaccca 
gggtcacgtc 
cgagccggtc 
cggcactggt 
cttcaatcct 
gtctcagaag 
ctcggattcc 
ggcacctaca 
gacagtggtc 
ccaaccacgt 
gtctactcca 
caacaaaggg 
atcaaaagga 
aaggctatcg 
aggagcatcg 
gatatctcca 
ctatataagg 
caaatctatc 
gaactcaccg 
ctgatgeage 
ggatatgtcc 
eggcactttg 
gagagectga 
gaaaccgaac 
gecgatctta 
actacatggc 
gtgatggacg 
gecgaggact 
ctgaeggaca 
tcccaatacg 
cagacgcgct 
tatatgctcc
gatgcagctt
gggcgtacac



6540 

6600 

6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 

7200 

7260 

7320 

7380 

7440 

7500 

7560 

7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10140 

10200 

10260 

10320 

10380 

10440 

10500 
aaatcgcccg cagaagcgcg gccgtctgga
gtggaaaccg acgccccagc actcgtccga 
atctgtcgat cgacaagctc gagtttctcc 
ggaattaggg ttcctatagg gtttcgctca 
gtatttgtat ttgtaaaata cttctatcaa 
agtactaaaa tccagatccc ccgaattaat 
gaatatcgcg agtaaactga aaatcacgga 
atatggcgag gaaaactgaa aaaggtggaa 
atatggcaag aaaactgaaa atcatggaaa 
atgacgaaat cactaaaaaa cgtgaaaaat 
attcgattgt gctagccaat gtttaacaag 
tggtcgtggc tggcggtggt ggaaaattgc
ttggtgtttg cagcggtgtt tgatatcgga 
gcgtcatggt tattggtggt tggtcatcta 
cctatttttt acatattttt tattaaattt 
atcgtacttg ttttataaaa tattttatta 
tggaaatttt ctccattgtt ttttctatat
tattatgtat tttttcgttt tataataaat 
atatatcatt tacaatgttt aaaagtcatt 
tttgtgcatt tggtgttgta catgtctatt
tgtcacttgg gttttttttt ttaagacata 
gggaaacgac aatctgatca tgagcggaga 
atgacgcggg acaagccgtt ttacgtttgg 
actcagccgc gggtttctgg agtttaatga 
cgcgttcaaa agtcgcctaa ggtcactatc 
ccactgacgt tccataaatt cccctcggta 
ccaaataatc tgcaccggat ctcgagatcg 
tccccgggta cggtcagtcc cttatgttac 
aaaaactcga cggcctgtgg gcattcagtc
gttggtggga aagcgcgtta caagaaagcc 
atcagttcgc cgatgcagat attcgtaatt 
tctttatacc gaaaggttgg gcaggccagc 
attacggcaa agtgtgggtc aataatcagg 
catttgaagc cgatgtcacg ccgtatgtta 
gtgtgaacaa cgaactgaac tggcagacta 
acggcaagaa aaagcagtct tacttccatg 
gcgtaatgct ctacaccacg ccgaacacct 
tcgcgcaaga ctgtaaccac gcgtctgttg 
gcgttgaact gcgtgatgcg gatcaacagg 
ctttgcaagt ggtgaatccg cacctctggc 
acgtcacagc caaaagccag acagagtgtg
cagtggcagt gaagggcgaa cagttcctga 
ttggccgtca tgaagatgcg gatttgcgcg 
acgatcacgc attaatggac tggattgggg 
acgctgaaga gatgctcgac tgggcagatg 
cagctgtcgg ctttaacctc tctttaggca 
aactgtacag cgaagaggca gtcaacgggg 
aagagctgat agcgcgtgac aaaaaccacc 
aaccggatac ccgtccgcaa ggtgcacggg 
gtaaactcga tccgacgcgt ccgatcacct 
ccgataccat cagcgatctc tttgatgtgc 
tccaaagcgg cgatttggaa acggcagaga 
aggagaaact gcatcagccg attatcatca 
tgcactcaat gtacaccgac atgtggagtg
atcaccgcgt ctttgatcgc gtcagcgccg 
attttgcgac ctcgcaaggc atattgcgcg 
gcgaccgcaa accgaagtcg gcggcttttc
tcggtgaaaa accgcagcag ggaggcaaac 
cgtcggctac agcctcggga attgcgtacc 
ttggcaataa agtttcttaa gattgaatcc 
atttctgttg aattacgtta agcatgtaat 
gagatgggtt tttatgatta gagtcccgca 
aatatagcgc gcaaactagg ataaattatc 
ggaattcgat atcaagcttg gcactggccg 
ctggcgttac ccaacttaat cgccttgcag 
gcgaagaggc ccgcaccgat cgcccttccc
agagcagctt gagcttggat cagattgtcg 




ccgatggctg tgtagaagta ctcgccgata 10560 
gggcaaagaa atagagtaga tgccgaccgg 10620
ataataatgt gtgagtagtt cccagataag 10680 
tgtgttgagc atataagaaa cccttagtat 10740
taaaatttct aattcctaaa accaaaatcc 10800 
tcggcgttaa ttcagatcaa gcttgacctg 10860 
aaatgagaaa tacacacttt aggacgtgaa 10920 
aatttagaaa tgtccactgt aggacgtgga 10980 
atgagaaaca tccacttgac gacttgaaaa 11040
gagaaatgca cactgaagga ctccgcggga 11100
atgtcaagca caatgaatgt tggtggttgg 11160
ggtggttcga gcggtagtga tcggcgatgg 11220
atcacttatg gtggttgtca caatggaggt 11280
tatattttta taataatatt aagtatttta 11340 
atgcattgtt tgtattttta aatagttttt 11400 
ttttatgtgt tatattatta cttgatgtat 11460
ttataataat tttcttattt ttttttgttt 11520
atttattaaa aaaaatatta tttttgtaaa 11580 
tgtgaatata ttagctaagt tgtacttctt 11640
atgattctct ggccaaaaca tgtctactcc 11700 
atcactagtg attatatcta gactgaaggc 11760
attaagggag tcacgttatg acccccgccg 11820
aactgacaga accgcaacgt tgaaggagcc 11880
gctaagcaca tacgtcagaa accattattg 11940
agctagcaaa tatttcttgt caaaaatgct 12000
tccaattaga gtctcatatt cactctcaat 12060
aattcccgcg gccgcgaatt cactagtgga 12120
gtcctgtaga aaccccaacc cgtgaaatca 12180 
tggatcgcga aaactgtgga attgagcagc 12240
gggcaattgc tgtgccaggc agttttaacg 12300 
atgtgggcaa cgtctggtat cagcgcgaag 12360
gtatcgtgct gcgtttcgat gcggtcactc 12420 
aagtgatgga gcatcagggc ggctatacgc 12480
ttgccgggaa aagtgtacgt atcacagttt 12540
tcccgccggg aatggtgatt accgacgaaa 12600
atttctttaa ctacgccggg atccatcgca 12660 
gggtggacga tatcaccgtg gtgacgcatg 12720 
actggcaggt ggtggccaat ggtgatgtca 12780 
tggttgcaac tggacaaggc accagcggga 12840
aaccgggtga aggttatctc tatgaactgt 12900 
atatctaccc gctgcgcgtc ggcatccggt 12960 
tcaaccacaa accgttctac tttactggct 13020 
gcaaaggatt cgataacgtg ctgatggtgc 13080
ccaactccta ccgtacctcg cattaccctt 13140 
aacatggcat cgtggtgatt gatgaaactg 13200
ttggtttcga agcgggcaac aagccgaaag 13260 
aaactcagca ggcgcactta caggcgatta 13320 
caagcgtggt gatgtggagt attgccaacg 13380
aatatttcgc gccactggcg gaagcaacgc 13440 
gcgtcaatgt aatgttctgc gacgctcaca 13500 
tgtgcctgaa ccgttattac ggttggtatg 13560 
aggtactgga aaaagaactt ctggcctggc 13620
ccgaatacgg cgtggatacg ttagccgggc 13680
aagagtatca gtgtgcatgg ctggatatgt 13740
tcgtcggtga acaggtatgg aatttcgccg 13800
ttggcggtaa caagaagggg atcttcaccc 13860
tgctgcaaaa acgctggact ggcatgaact 13920
aatgaatcaa caactctcct ggcgcaccat 13980
gagctcgaat ttccccgatc gttcaaacat 14040
tgttgccggt cttgcgatga ttatcatata 14100 
aattaacatg taatgcatga cgttatttat 14160 
attatacatt taatacgcga tagaaaacaa 14220 
gcgcgcggtg tcatctatgt tactagatcg 14280
tcgttttaca acgtcgtgac tgggaaaacc 14340
cacatccccc tttcgccagc tggcgtaata 14400
aacagttgcg cagcctgaat ggcgaatgct 14460
tttcccgcct tcagtttaaa ctatcagtgt 14520 
ttgacaggat atattggcgg gtaaacctaa gagaaaagag cgtttattag aataacggat 14580
atttaaaagg gcgtgaaaag gtttatccgt tcgtccattt gtatgtg 14627 

<210> 110 
<211> 9080 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> p18attBZeo(6XHS4)2eGFP Plasmid



<400> 110

cagttgccgg 

gtcatggccg 

tacagctcgt 

tcctggaccg 

tccacgaagt 

tcgcgcgcgg 

caagttagta 

agccccgcgg 

gtccctcccc 

ccccccgcat 

gggatcgctt 

acggggccgc 

acgtccctcc 

ctccccccgc 

acgggatcgc 

atacggggcc 

ttacgtccct 

cgctcccccc 

gcacgggatc 

ggatacgggg 

aattacgtcc 

ggcgctcccc 

tggcacggga 

ggggatacgg 

gtaattacgt 

ccggcgctcc 

ggtggcacgg 

tgggggatac 

atgtaattac 

gtccggcgct 

aaggtggcac 

cctgggggat 

tagttcatag 

gctgaccgcc 

cgccaatagg 

tggcagtaca 

aatggcccgc 

acatctacgt 

actctcccca 

ttttgtgcag 

cgaggggcgg 

ccgaaagttt 

gcggcgggcg 

ccgcccgccc 

ttctcctccg 

gcgtgaaagc 

gtgcgtgcgt 

tgagcgctgc 

gccgggggcg 

gtgtgtgcgt 

ctgcaccccc 

gggcgtggcg 

ggggcggggc 

ccggcggctg 
gggcgcaggg 



ccgggtcgcg 
gcccggaggc 
ccaggccgcg 
cgctgatgaa 
cccgggagaa 
tgagcaccgg 
taaaaaagca 
atccgctcac 
cgctaggggg 
ccccgagccg 
tcctctgaac 
ggatccgctc 
cccgctaggg 
atccccgagc 
tttcctctga 
gcggatccgc 
cccccgctag 
gcatccccga 
gctttcctct 
ccgcggatcc 
ctcccccgct 
ccgcatcccc 
tcgctttcct 
ggccgcggat 
ccctcccccg 
ccccgcatcc 
gatcgctttc 
ggggccgcgg 
gtccctcccc 
ccccccgcat 
gggatcgctt
acggggcggg 
cccatatatg 
caacgacccc 
gactttccat 
tcaagtgtat 
ctggcattat 
attagtcatc 
tctccccccc 
cgatgggggc 
ggcggggcga 
ccttttatgg 
ggagtcgctg 
cggctctgac 
ggctgtaatt 
cttaaagggc 
gtgtgtgtgc 
gggcgcggcg 
gtgccccgcg 

gggggggtga 

ctccccgagt 
cggggctcgc 
cgcctcgggc 
tcgaggcgcg 
acttcctttg 



cagggcgaac 
gtcccggaag 
cacccacacc 
cagggtcacg 
cccgagccgg 
aacggcactg 
ggcttcaatc 
ggggacagcc 
cagcagcgag 
gcagcgtgcg 
gcttctcgct 
acggggacag 
ggcagcagcg 
cggcagcgtg 
acgcttctcg 
tcacggggac 
ggggcagcag 
gccggcagcg 
gaacgcttct 
gctcacgggg 
agggggcagc 
gagccggcag 
ctgaacgctt 
ccgctcacgg 
ctagggggca
ccgagccggc 
ctctgaacgc 
atccgctcac 
cgctaggggg 
ccccgagccg 
tcctctgaac 
ggatccacta 
gagttccgcg 
cgcccattga 
tgacgtcaat 
catatgccaa 
gcccagtaca 
gctattacca 
ctccccaccc 

gggggggggg 

ggcggagagg 
cgaggcggcg 
cgttgccttc 
tgaccgcgtt 
agcgcttggt 
tccgggaggg 

gtggggagcg 
cggggctttg 
gtgcgggggg 
gcagggggtg 
tgctgagcac 
cgtgccgggc 
cggggagggc 
gcgagccgca 
tcccaaatct 



tcccgccccc 
ttcgtggaca 
caggccaggg 
tcgtcccgga 
tcggtccaga 
gtcaacttgg 
ctgcagagaa 
cccccccaaa 
ccgcccgggg 
gggacagccc 
gctctttgag 
ccccccccca 
agccgcccgg
cggggacagc
ctgctctttg 
agcccccccc 
cgagccgccc 
tgeggggaca 
cgctgctctt 
acagcccccc 
agcgagccgc 
cgtgcgggga 
ctcgctgctc 
ggacagcccc 
gcagcgagcc 
agcgtgcggg 
ttctcgctgc 
ggggacagcc 
cagcagcgag 
gcagcgtgcg 
gcttctcgct 
gttattaata 
ttacataact 
cgtcaataat 

gggtggacta 

gtacgccccc 
tgaccttatg 
tgggtcgagg 
ccaattttgt 
ggggcgcgcg 
tgeggeggea 
geggeggegg 
gccccgtgcc 
actcccacag 
ttaatgaegg 
ccctttgtgc 
ccgcgtgcgg 
tgcgctccgc 
getgegaggg 
tgggegegge 
ggcccggctt 

ggggggtggc
tcgggggagg

gccattgcct 
ggcggagccg 



acggctgctc 
cgacctccga 
tgttgtccgg 
ccacaccggc 
actcgaccgc 
ccatggatcc 
gcttgatatc 
gcccccaggg 
ctccgctccg 
gggcacgggg 
cctgcagaca 
aagcccccag 
ggctccgctc 
ccgggcacgg 
agectgeaga 
caaagccccc 
ggggctccgc 
gcccgggcac 
tgagectgea 
cccaaagccc 
ccggggctcc 
cagcccgggc 
tttgagcctg 
cccccaaagc 
gcccggggct 
gacagcccgg 
tetttgagee 
cccccccaaa 
ccgcccgggg 
gggacagccc 
gctctttgag 
gtaatcaatt 
tacggtaaat 
gacgtatgtt 
tttacggtaa 
tattgaegtc 
ggactttcct 
tgagccccac 
atttatttat 
ecaggegggg 
gecaatcaga 
ccctataaaa 
ccgctccgcg 
gtgageggge 
ctegtttett 

gggggggagc 

cccgcgctgc 
gtgtgcgcga 
gaacaaaggc 
ggtcgggctg 
cgggtgcggg 
ggcaggtggg 
ggegeggegg 
tttatggtaa 
aaatctggga 



gccgatctcg 
ccactcggcg 
caccacctgg 
gaagtegtec 
tccggcgacg 
agatttcget 
gaattcctgc 
atgtaattac 
gtccggcgct 
aaggtggcac 
cctgggggat 
ggatgtaatt 
cggtccggcg 
ggaaggtggc 
cacctggggg 
agggatgtaa 
tccggtccgg 
ggggaaggtg 
gacacctggg 
ccagggatgt 
gctccggtcc 
aeggggaagg 
cagacacctg 
ccccagggat 
ccgctccggt 
geaeggggaa 
tgcagacacc 
gcccccaggg 
ctccgctccg 
gggcacgggg 
cctgcagaca 
aeggggt cat 
ggcccgcctg 
cccatagtaa 
actgcccact 
aatgacggta 
acttggcagt 
gttctgette 
tttttaatta 
eggggegggg 
gcggcgcgct 
agegaagege 
ccgcctcgcg 
gggacggccc 
ttctgtggct 
ggctcggggg 
ccggcggctg 
ggggagcgcg 
tgcgtgcggg 
taaccccccc 
gctccgtgcg 
ggtgccgggc 
ccccggagcg 
tegtgegaga 
ggcgccgccg 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 
caccccctct
gagggccttc 
cgcaggggga 
tgaccggcgg 
tcctgggcaa 
tggtgagcaa 
gcgacgtaaa 
gcaagctgac 
tcgtgaccac 
agcacgactt 
tcaaggacga 
tgaaccgcat 
age t ggag t a 
gcatcaaggt 
accactacca 
acctgagcac 
tgctggagtt 
aattcactcc 
tggctcacaa 
agccccttga 
gttggaattt 
atcagaatga 
caaaggtggc 
ttccatagaa 
ttttctttaa 
tcctgactac 
cccaagcttg 
gtctgeagge 
cccgtgcccg 
ggageggage 
cc tgggggct 
gtgtctgcag 
tccccgtgcc 
ccggagcgga 
tccctggggg 
aggtgtctgc 
cttccccgtg 
gaeeggageg 
catccctggg 
ccaggtgtct 
accttccccg 
eggaceggag 
tacatccctg 
ccccaggtgt 
ccaccttccc 
gccggaccgg 
attacatccc 
tcccccaggt 
tgccaccttc 
gcgccggacc 
taattacatc 
caggaattcg 
tccacacaac 
ctaactcaca 
ccagctgcat 
ttccgcttcc 
agctcactca 
catgtgagca 
tttccatagg 
gcgaaacccg 
ctctcctgtt 
cgtggcgctt 
caagctgggc 
ctatcgtctt
taacaggatt
taactacggc
cttcggaaaa



agegggegeg 
gtgcgtcgcc 
cggctgcctt 
ctctagagcc 
cgtgctggtt 
gggegaggag 
cggccacaag 
cctgaagttc 
cctgacctac 
cttcaagtcc 
cggcaactac 
cgagctgaag 
caactacaac 
gaacttcaag 
gcagaacacc 
ccagtccgcc 
cgtgaccgcc 
tcaggtgcag 
ataccactga 
gcatctgact 
tttgtgtctc 
gtatttggtt 
tat aaagagg 
aagecttgae 
catccctaaa 
tcccagtcat 
catgcctgca 
tcaaagagca 
ggctgtcccc 
cccgggcggc 
ttgggggggg 
gctcaaagag 
cgggctgtcc 
gccccgggcg 
ctttgggggg 
aggctcaaag 
cccgggctgt 
gagccccggg 
ggctttgggg 
gcaggctcaa 
tgcccgggct 
cggagccccg 
ggggctttgg 
ctgcaggctc 
cgtgcccggg 
agcggagccc 
tgggggct tt 
gtctgeagge 
cccgtgcccg 
ggageggage 
cctgggggct 
taatcatggt 
atacgagecg 
ttaattgcgt 
taatgaatcg 
tcgctcactg 
aaggcggtaa 
aaaggecage 
ctccgccccc 
acaggactat 
ccgaccctgc 
tctcatagct 
tgtgtgcacg 
gagtccaacc 
ageagagega 
tacactagaa 
agagttggta 



ggegaagegg 
gcgccgccgt 
egggggggae 
tctgctaacc 
gttgtgctgt 
ctgttcaccg 
ttcagcgtgt 
atctgcacca 
ggcgtgcagt 
gccatgcccg 
aagacccgcg 
ggcatcgact 
agccacaacg 
atccgccaca 
cccatcggcg 
ctgagcaaag 
geegggatea 
gctgcctatc 
gatctttttc 
tctggctaat 
teacteggaa 
tagagtttgg 
tcatcagtat 
ttgaggttag 
attttcctta 
agctgtccct 
ggtcgactct 
gcgagaagcg 
gcacgctgcc 
tegetgetge 
gctgtccccg 
cagegagaag 
ccgcacgctg 
gctcgctgct 
gggctgtccc 
ageagegaga 
ccccgcacgc 
cggctcgctg 

gggggctgtc 

agagcagega 
gtccccgcac 
ggcggctcgc 

gggggggctg 

aaagagcagc 
ctgtccccgc 
cgggcggctc 

gggggggggc 

tcaaagagca 
ggctgtcccc 
cccgggcggc 
ttgggggggg 
catagctgtt 
gaagcataaa 
tgcgctcact 
gccaacgcgc 
actcgctgcg 
tacggttatc 
aaaaggccag 
ctgacgagca 
aaagatacca 
cgcttaccgg 
caegctgtag 
aaccccccgt 
eggtaagaca 
ggtatgtagg 
ggacagtatt 
gctcttgatc 



tgcggcgccg 
ccccttctcc 
ggggcagggc 
atgttcatgc 
ctcatcattt 
gggtggtgcc 
ccggcgaggg 
ccggcaagct 
gcttcagccg 
aaggctacgt 
ccgaggtgaa 
tcaaggagga 
tctatatcat 
acatcgagga 
acggccccgt 
accccaacga 
ctctcggcat 
agaaggtggt 
cctctgccaa 
aaaggaaatt 
ggacatatgg 
caacatatgc 
atgaaacagc 
atttttttta 
catgttttac 
cttctcttat 
agtggatccc 
ttcagaggaa 
ggctcgggga 
cccctagcgg 
tgageggate 
cgttcagagg 
ccggctcggg 
gccccctagc 
cgtgagcgga 
agegttcaga 
tgccggctcg 
ctgcccccta 
cccgtgagcg 
gaagcgttca 
gctgccggct 
tgctgccccc 
tccccgtgag 
gagaagegtt 
acgctgccgg 
gctgctgccc 
tgtccccgtg 
gcgagaagcg 
gcacgctgcc 
tcgctgctgc 
gctgtccccg 
tcctgtgtga 
gtgtaaagcc 
gcccgctttc 
ggggagaggc 
cteggtegtt 
cacagaatca 
gaaccgtaaa 
tcacaaaaat 
ggcgtttccc 
atacctgtcc 
gtatctcagt 
tcagcccgac 
cgacttatcg 
eggtgetaca 
tggtatctgc 
eggcaaacaa 



gcaggaagga 
atctccagcc 

ggggttcggc 

cttcttcttt 
tggcaaagaa 
catcctggtc 
egagggegat 
gcccgtgccc 
ctaccccgac 
ccaggagcgc 
gttcgagggc 
cggcaacatc 
ggccgacaag 
eggcagegtg 
gctgctgccc 
gaagegegat 
ggacgagctg 
ggctggtgtg 
aaattatggg 
tattttcatt 
gagggcaaat 
catatgetgg 
cccctgctgt 
tattttgttt 
tagecagatt 
gaagatccct 
ccgccccgta 
agcgatcccg 
tgcgggggga 
gggagggacg 

cgcggccccg 
aaagegatec 
gatgeggggg 

gggggaggga 

tccgcggccc 
ggaaagegat 
gggatgcggg 
gegggggagg 
gatccgcggc 
gaggaaagcg 
eggggatgeg 
tageggggga 
cggatccgcg 
cagaggaaag 
c t egggga t g 
c c t agegggg 
ageggatccg 
ttcagaggaa 
ggctcgggga 
cccctagcgg 
tgageggate 
aattgttatc 
tggggtgcct 

cagtegggaa 
ggtttgcgta 
cggctgcggc 
ggggataacg 
aaggccgcgt 
cgacgctcaa 
cctggaagct 
gcctttctcc 
tcggtgtagg 
cgctgcgcct 
ccactggcag 
gagttcttga 
getctgetga 
accaccgctg 



aatgggcggg 
teggggctge 
ttctggcgtg 
ttcctacagc 
ttcgccacca 
gagctggacg 
gccacctacg 
tggcccaccc 
cacatgaagc 
accatcttct 
gacaccctgg 
ctggggcaca 
cagaagaacg 
cagctcgccg 
gacaaccact 
cacatggtcc 
tacaagtaag 
gccaatgccc 
gacatcatga 
gcaatagtgt 
catttaaaac 
ctgccatgaa 
ccattcctta 
tgtgttattt 
tttcctcctc 
cgacctgcag 
tcccccaggt 
tgccaccttc 
gcgccggacc 
taattacatc 
tatcccccag 
cgtgccacct 
gagegcegga 
cgtaattaca 
cgtatccccc 
cccgtgccac 
gggagcgccg 
gaegtaatta 
cccgtatccc 
atcccgtgcc 

gggggagege 

gggacgtaat 
gccccgtatc 
cgatcccgtg 
eggggggage 
gaggga eg t a 
cggccccgta 
agcgatcccg 
tgcgggggga 
gggagggacg 
cgcggggctg 
cgctcacaat 
aatgagtgag 
acctgtcgtg 
ttgggcgctc 
gageggtate 
caggaaagaa 
tgctggcgtt 
gtcagaggtg 
ccctcgtgcg 
ettegggaag 
tcgttcgctc 
tatceggtaa 
cagccactgg 
agtggtggcc 
agecagttae 
gtagcggtgg 



3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960
7020 
7080 
7140 
7200 
7260 
7320 
tttttttgtt
gatcttttct 
catgagatta 
atcaatctaa
ggcacctatc 
gtagataact 
agacccacgc 
gcgcagaagt 
agctagagta 
catcgtggtg 
aaggcgagtt 
gatcgttgtc 
taattctctt 
caagtcattc 
ggataatacc 

ggggcgaaaa 

tgcacccaac 
aggaaggcaa 
actcttcctt 
catatttgaa 
agtgccacct 
aagcgccatt 
tcgctattac 
ccagggtttt 
aaggccttga 
acaaaccaca 
tgctttattt 
gcatgagatc 
ccaacctttc 
cctcggccac 



tgcaagcagc 
acggggtctg 
tcaaaaagga 
agtatatatg 
tcagcgatct 
acgatacggg 
tcaccggctc 
ggtcctgcaa 
agtagttcgc 
tcacgctcgt 
acatgatccc 
agaagtaagt 
actgtcatgc 
tgagaatagt 
gcgccacata 
ctctcaagga 
tgatcttcag 
aafcgccgcaa 
tttcaatatt 
tgtatttaga 
gacgtagtta 
cgccattcag 
gccagctggc 
cccagtcacg 
ctagagggtc 
actagaatgc 
gtaaccatta 
cccgcgctgg 
atagaaggcg 
gaagtgcacg 



agattacgcg 
acgctcagtg 
tcfctcaccta 
agtaaacttg 
gtctatttcg 
agggcttacc 
cagatttatc 
ctttatccgc 
cagttaatag 
cgtfctggtat 
ccatgttgtg 
tggccgcagt 
catccgtaag 
gtatgcggcg 
gcagaacttt 
tcttaccgct 
catcttttac 
aaaagggaat 
attgaagcat 
aaaataaaca 
acaaaaaaaa 
gctgcgcaac 
gaaaggggga 
acgttgtaaa 
gacggtatac 
agtgaaaaaa 
taagctgcaa 
aggatcatcc 
gcggtggaat 



cagaaaaaaa 
gaacgaaaac 
gatcctttta 
gtctgacagfc 
ttcatccata 
atctggcccc 
agcaataaac 
ctccatccag 
tttgcgcaac 
ggcttcattc 
caaaaaagcg 
gttatcactc 
atgcttttct 
accgagttgc 
aaaagtgctc 
gfcfcgagatcc 
tttcaccagc 
aagggcgaca 
ttatcagggt 
aataggggtt 
gcccgccgaa 
tgttgggaag 
tgtgctgcaa 
acgacggcca 
agacatgata 
atgctttatt 
taaacaagtt 
agccggcgtc 
cgaaatctcg 



ggatctcaag 
tcacgttaag 
aattaaaaat 
taccaatgct 
gttgcctgac 
agtgctgcaa 
cagccagccg 
tctattaatt 
gttgttgcca 
agctccggtfc 
gtfcagctccfc 
atggttatgg 
gtgactggtg 
tcttgcccgg 
atcattggaa 
agttcgatgt 
gtttctgggt 
cggaaatgtt 
tattgtctca 
ccgcgcacat 
gcgggcttta 
ggcgatcggt 
ggcgattaag 
gtccgtaata 
agatacattg 
tgtgaaattt 
ggggtgggcg 
ccggaaaacg 
tagcacgtgt 



aagatccttt 
ggattttggt 
gaagttttaa 
taatcagtga 
tccccgtcgt 
tgataccgcg 
gaagggccga 
gttgccggga 
ttgctacagg 
cccaacgatc 
tcggtcctcc 
cagcactgca 
agtactcaac 
cgtcaatacg 
aacgttcttc 
aacccactcg 
gagcaaaaac 
gaatactcat 
tgagcggata 
ttccccgaaa 
ttaccaagcg 
gcgggcctct 
1 1 ~ggg t aacg 
cgactcactt 
atgagtttgg 
gtgatgctat 
aagaactcca 
attccgaagc 
cagtcctgct 



7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9080 



<210> 111 

<211> 4223 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pLIT38attBBSRpolyA10 Plasmid



<400> 111 

gttaactacg 

tttctaaata

ataatattga 

ttttgcggca 

tgctgaagat 

gatccttgag 

gctatgtggc 

acactattct 

tggcatgaca 

caacttactt 

gggggatcat 

cgacgagcgt 
tggcgaacta 
agttgcagga 
tggagccggt
ctcccgtatc 
acagatcgct 
ctcatatata 
aagattgtat 
aatttttgtt 
aaatcaaaag 
ctattaaaga 
ccactacgtg 
aatcggaacc
gaaaggaagg 
cgctgcgcgt
atctaggtga 



tcaggtggca 
cattcaaata 
aaaaggaaga 
ttttgccttc 
cagttgggtg 
agttttcgcc 
gcggtattat 
cagaatgact 
gtaagagaat 
ctgacaacga 
gtaactcgcc 
gacaccacga 
cttactctag 
ccacttctgc 
gagcgtgggt 
gtagttatct
gagataggtg 
ctttagattg 
aagcaaatat 
aaatcagctc 
aatagcccga 
acgtggactc 
aaccatcacc 
ctaaagggag 
gaagaaagcg 
aaccaccaca 
agatcctttt 



cttttcgggg
tgtatccgct 
gtatgagtat 
ctgtttttgc 
cacgagtggg 
ccgaagaacg 
cccgtgttga 
tggttgagta 
tatgcagtgc 
tcggaggacc
ttgatcgttg 
tgcctgtagc 
cttcccggca 
gctcggccct 
ctcgcggtat 
acacgacggg 
cctcactgat 
atttaccccg 
ttaaattgta 
attttttaac 
gatagggttg 
caacgtcaaa 
caaatcaagt 
cccccgattt 
aaaggagcgg 
cccgccgcgc 
tgataatctc 



aaatgtgcgc 
catgagacaa 
tcaacatttc 
tcacccagaa 
ttacatcgaa
ttctccaatg 
cgccgggcaa
ctcaccagtc 
tgccataacc 
gaaggagcta 
ggaaccggag 
aatggcaaca 
acaattaata 
tccggctggc 
cattgcagca 
gagtcaggca
taagcattgg 
gttgataatc 
aacgttaata 
caataggccg 
agtgttgttc 
gggcgaaaaa 
tttttggggt 
agagcttgac 
gcgctagggc 
ttaatgcgcc 
atgaccaaaa 



ggaaccccta 
taaccctgat 
cgtgtcgccc 
acgctggtga 
ctggatctca 
atgagcactt 
gagcaactcg 
acagaaaagc 
atgagtgata
accgcttttt 
ctgaatgaag 
acgttgcgca 
gactggatgg 
tggtttattg
ctggggccag 
actatggatg 
taactgtcag 
agaaaagccc
ttttgttaaa 
aaatcggcaa 
cagtttggaa 
ccgtctatca 

cgaggtgccg 

ggggaaagcg 
gctggcaagt 
gctacagggc 
tcccttaacg 



tttgtttatt 
aaatgcttca 
ttattccctt 
aagtaaaaga 
acagcggtaa 
ttaaagttct 
gtcgccgcat 
atcttacgga 
acactgcggc 
tgcacaacat 
ccataccaaa 
aactattaac 
aggcggataa 
ctgataaatc 
atggtaagcc 
aacgaaatag 
accaagttta 
caaaaacagg 
attcgcgtta 
aatcccttat 
caagagtcca 
gggcgatggc 
taaagcacta 
aacgtggcga 
gtagcggtca 
gcgtaaaagg 
tgagttttcg 



60 

120

180 

240 

300 

360 

420 

480 

540 

600 

660 

720

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560

1620 
ttccactgag
ctgcgcgtaa 
ccggatcaag 
ccaaatactg 
ccgcctacat 
tcgtgtctta 
fcgaacggggg 
tacctacagc 
tatccggtaa 
gcctggtatc 
tgatgctcgt 
ttcctggcct 
accccaggct 
acaatttcac 
ctagtggggc 
fcgctttttta 
aacaagatct 
ataaacatca 
atattgaagc 
cagtttcgaa 
acgaagtaga 
cagactatgc 
cgattgaaga 
ttggctgctg 
gcatccagga 
gaggcacgat 
cataattgga 
gtgtataatg 
gaactgatga 
aagaaatgcc 
aaaagaagag 
gtcatgctgfc 
aagctgcact 
ataacagtta 
ctattaataa 
ataaggaata 
gtagaggttt 
atgaatgcaa 
aatagcatca 
gtctccggat 
cggactggcc 
tcgccttgca 
tcgcccttcc 
cccgcttcgg 



cgtcagaccc 
tcfcgcfcgctt 
agctaccaac 
ttcttctagt 
acctcgctct 
ccgggfctgga 
gttcgtgcac 
gtgagctatg 
gcggcagggt 
tttatagtcc 
caggggggcg 
tttgctggcc 
ttacacttta 
acaggaaaca 
ccgtgcaatt 
tactaacttg 
agaattagta 
tgtgggagcg 
gtatatagga 
tggacaaaag 
tagaagtatt 
accagattgt 
actcattcca 
cctgaggctg 
aaccagcagc 
ggccgctttg 
caaactacct 
tgttaaacta 
atgggagcag 
atctagtgat 
aaaggtagaa 
gtttagtaat 
gctatacaag 
taatcataac 
ctatgctcaa 
tttgatgtat 
tacttgcttt 
ttgttgttgt 
caaatttcac 
gtacaggcat 
gtcgttttac 
gcacatcccc 
caacagttgc 
cgggcttttt 



cgtagaaaag 
gcaaacaaaa 
tctttttccg 
gtagccgtag 
gctaatcctg 
ctcaagacga 
acagcccagc 
agaaagcgcc 
cggaacagga 
tgtcgggttt 
gagcctatgg 
ttttgctcac 
tgcttccggc 
gctatgacca 
gaagccggct 
agcgaaatct 
gaagtagcga 
gcaattcgta 
cgagtaactg 
gattttgaca 
cgagtggtaa 
tttgtgttaa 
ctcaaatata 
gacgacctcg 
ggctatccgc 
gtccggatct 
acagagattt 
ctgattctaa 

tggtggaatg 

gatgaggcta 
gaccccaagg 
agaactcttg 
aaaattatgg 
atacfcgtfcfcfc 
aaattgtgta 
agtgccttga 
aaaaaaccfcc 
taacttgttt 
aaataaagat 
gcgtcgaccc 
aacgtcgtga 
ctttcgccag 
gcagcctgaa 
ctt 



atcaaaggat 
aaaccaccgc 
aaggtaacfcg 
tfcaggccacc 
ttaccagtgg 
tagttaccgg 
ttggagcgaa 
acgcttcccg 
gagcgcacga 
cgccacctct 
aaaaacgcca 
atgtaatgtg 
tcgtatgttg 
tgattacgcc 
ggcgccaagc 
ggatcaccat 
cagagaagat 
cgaaaacagg 
tttgtgcaga 
cgattgtagc 
gtccttgtgg 
tagaaatgaa 
cccgaaatta 
cggagt t c t a 
gcatccatgc 
ttgtgaagga 
aaagctctaa 
fctgttfcgtgfc 
cctttaatga 
cbgctgactc 
actttccttc 
cttgctttgc 
aaaaatattc 
ttcttactcc 
cctttagctt 
cfcagagatca 
ccacacctcc 
attgcagctfc 
ccacgaattc 
tctagtcaag 
ctgggaaaac 
ctggcgtaat 
tggcgaatgg 



cttcttgaga 
fcaccagcggt 
gcttcagcag 
acttcaagaa 
ctgctgccag 
ataaggcgca 
cgacctacac 
aagggagaaa 

gggagcttcc 

gacttgagcg 
gcaacgcggc 
agttagctca 
tgtggaattg 
aagctacgta 
ttctctgcag 
gaaaacattt 
tacaatgctt 
agaaatcatt 
agccattgcg 
tgttagacac 
tatgtgtagg 
tggcaagtta 
aaagttttac 
ccggcagtgc 
ccccgaactg 
accttacttc 
ggtaaatata 
attttagatt 
ggaaaacctg 
tcaacattct 
agaattgcta 
tatttacacc 
tgtaaccttt 
acacaggcat 
tttaatttgt 
taatcagcca 
ccctgaacct 
ataatggtta 
gctagcttcg 
gccttaagtg 
cctggcgtta 
agcgaagagg 
cgcttcgctt 



tccttttttt 
ggtttgfcfctg 
agcgcagat a 
ctctgtagca 
tggcgataag 
gcggtcgggc 
cgaactgaga 
ggcggacagg 
agggggaaac 
tcgatttttg 
ctttttacgg 
ctcattaggc 
tgagcggata 
atacgactca 
gattgaagcc 
aacatttctc 
tatgaggata 
tcggcagtac 
attggtagtg 
ccttattctg 
gagttgattt 
gtcaaaacta 
cataccaagc 
aaatccgtcg 
caggagtggg 

tgtggtgtga 

aaatttttaa 
ccaacctatg 
ttttgctcag 
actcctccaa 
agttttttga 
acaaaggaaa 
ataagtaggc 
agagtgtctg 
aaaggggtta 
taccacattt 
gaaacataaa 
caaataaagc 
gccgtgacgc 
agtcgtatta 
cccaacttaa 
cccgcaccga 
ggtaataaag 



1680
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4223 



<210> 112 
<211> 5855 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCX-LamIntR Plasmid



<400> 112 

gtcgacattg 

gcccatatat 

ccaacgaccc 

ggactttcca 

atcaagtgta 

cctggcatta 

tattagtcafc 

atctcccccc 

gcgatggggg 

gggcggggcg 

tccttttatg 

gggagtcgct 

ccggctctga 



attattgact 
ggagttccgc 
ccgcccattg 
ttgacgtcaa 
tcatatgcca 
tgcccagtac 
cgctattacc 
cctccccacc 
cggggggggg 
aggcggagag 
gcgaggcggc 
gcgttgcctt 
ctgaccgcgt 



agttattaat 
gttacataac 
acgtcaataa 

tgggtggact 

agtacgcccc 
atgaccttat 
atgggtcgag 
cccaattttg 

gggggcgcgc 

gtgcggcggc 
ggcggcggcg 
cgccccgtgc 
tactcccaca 



agtaatcaat 
ttacggtaaa 
tgacgtatgt 
atttacggta 
ctattgacgt 
gggactttcc 
gtgagcccca 
tatttattta 
gccaggcggg 
agccaatcag 
gccctataaa 
cccgctccgc 
ggtgagcggg 



tacggggtca 
tggcccgcct 
tcccatagta 
aactgcccac 
caatgacggt 
tacttggcag 
cgttctgctt 
ttttttaatt 
gcggggcggg 
agcggcgcgc 
aagcgaagcg 
gccgcctcgc 
cgggacggcc 



ttagttcata 
ggctgaccgc 
acgccaatag 
ttggcagtac 
aaatggcccg 
tacatctacg 
cactctcccc 
attttgtgca 
gcgaggggcg 
tccgaaagtt 
cgcggcgggc 
gccgcccgcc 
cttctcctcc 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 
gggctgtaat tagcgcttgg tttaatgacg
ccttaaaggg ctccgggagg gccctttgtg 
tgtgtgtgtg cgtggggagc gccgcgtgcg
cgggcgcggc gcggggcttt gtgcgctccg 
ggtgccccgc ggtgcggggg ggctgcgagg 
tgggggggtg agcagggggt gtgggcgcgg 
cctccccgag ttgctgagca cggcccggct 
gcggggctcg ccgtgccggg cggggggtgg 
ccgcctcggg ccggggaggg ctcgggggag 
gtcgaggcgc ggcgagccgc agccattgcc 
gacttccttt gtcccaaatc tggcggagcc 
tagcgggcgc gggcgaagcg gtgcggcgcc 
cgtgcgtcgc cgcgccgccg tccccttctc 
acggctgcct tcggggggga cggggcaggg
gctctagagc ctctgctaac catgttcatg 
acgtgctggt tgttgtgctg tctcatcatt 
gtcatgagcg ccgggattta ccccctaacc 
acagggaccc aaggacgggt aaagagtttg 
ctgaagctat acaggccaac attgagttat 
cgagaatcaa cagtgataat tccgttacgt
tcctggccag cagaggaatc aagcagaaga 
caataaggag gggtctgcct gatgctccac 
caatgctcaa tggatacata gacgagggca 
cactgagcga tgcattccga gaggcaatag 
ctgccactcg cgcagcaaaa tctagagtaa 
tgaaaattta tcaagcagca gaatcatcac 
ctgttgttac cgggcaacga gttggtgatt
atggatatct ttatgtcgag caaagcaaaa 
tgcatattga tgctctcgga atatcaatga 
ttggcggaga aaccataatt gcatctactc
caaggtattt tatgcgcgca cgaaaagcat 
cctttcacga gttgcgcagt ttgtctgcaa 
ttgctcaaca tcttctcggg cataagtcgg 
gaggcaggga gtgggacaaa attgaaatca 
cctatcagaa ggtggtggct ggtgtggcca 
tttttccctc tgccaaaaat tatggggaca 
gctaataaag gaaatttatt ttcattgcaa 
tcggaaggac atatgggagg gcaaatcatt 
gtttggcaac atatgccata tgctggctgc 
cagtatatga aacagccccc tgctgtccat 
ggttagattt tttttatatt ttgttttgtg 
tccttacatg ttttactagc cagatttttc 
gtccctcttc tcttatgaag atccctcgac 
atagctgttt cctgtgtgaa attgttatcc 
aagcataaag tgtaaagcct ggggtgccta 
gcgctcactg cccgctttcc agtcgggaaa 
tagtcagcaa ccatagtccc gcccctaact 
tccgcccatt ctccgcccca tggctgacta 
gcctcggcct ctgagctatt ccagaagtag
tgcaaaaagc taacttgttt attgcagctt
caaatttcac aaataaagca tttttttcac 
tcaatgtatc ttatcatgtc tggatccgct 
aggcggtttg cgtattgggc gctcttccgc 
cgttcggctg cggcgagcgg tatcagctca 
atcaggggat aacgcaggaa agaacatgtg 
taaaaaggcc gcgttgctgg cgtttttcca 
aaatcgacgc tcaagtcaga ggtggcgaaa 
tccccctgga agctccctcg tgcgctctcc
gtccgccttt ctcccttcgg gaagcgtggc 
cagttcggtg taggtcgttc gctccaagct 
cgaccgctgc gccttatccg gtaactatcg 
atcgccactg gcagcagcca ctggtaacag 
tacagagttc ttgaagtggt ggcctaacta 
ctgcgctctg ctgaagccag ttaccttcgg 
acaaaccacc gctggtagcg gtggtttttt 
aaaaggatct caagaagatc ctttgatctt 
aaactcacgt taagggattt tggtcatgag 




gctcgtttct tttctgtggc tgcgtgaaag 840 
cgggggggag cggctcgggg ggtgcgtgcg 900 
gcccgcgctg cccggcggct gtgagcgctg 960 
cgtgtgcgcg aggggagcgc ggccgggggc 1020 
ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
cggtcgggct gtaacccccc cctgcacccc 1140 
tcgggtgcgg ggctccgtgc ggggcgtggc 1200 
cggcaggtgg gggtgccggg cggggcgggg 1260 
gggcgcggcg gccccggagc gccggcggct 1320 
ttttatggta atcgtgcgag agggcgcagg 1380 
gaaatctggg aggcgccgcc gcaccccctc 1440 
ggcaggaagg aaatgggcgg ggagggcctt 1500 
catctccagc ctcggggctg ccgcaggggg 1560 
cggggttcgg cttctggcgt gtgaccggcg 1620 
ccttcttctt tttcctacag ctcctgggca 1680 
ttggcaaaga attcatggga agaaggcgaa 1740 
tttatataag aaacaatgga tattactgct 1800 
gattaggcag agacaggcga atcgcaatca 1860 
tttcaggaca caaacacaag cctctgacag 1920 
tacattcatg gcttgatcgc tacgaaaaaa 1980 
cactcataaa ttacatgagc aaaattaaag 2040
ttgaagacat caccacaaaa gaaattgcgg 2100 
aggcggcgtc agccaagtta atcagatcaa 2160 
ctgaaggcca tataacaaca aaccatgtcg 2220
ggagatcaag acttacggct gacgaatacc 2280
catgttggct cagacttgca atggaactgg 2340
tatgcgaaat gaagtggtct gatatcgtag 2400 
caggcgtaaa aattgccatc ccaacagcat 2460
aggaaacact tgataaatgc aaagagattc 2520 
gtcgcgaacc gctttcatcc ggcacagtat 2580 
caggtctttc cttcgaaggg gatccgccta 2640
gactctatga gaagcagata agcgataagt 2700 
acaccatggc atcacagtat cgtgatgaca 2760
aataagaatt cactcctcag gtgcaggctg 2820
atgccctggc tcacaaatac cactgagatc 2880
tcatgaagcc ccttgagcat ctgacttctg 2940
tagtgtgttg gaattttttg tgtctctcac 3000 
taaaacatca gaatgagtat ttggtttaga 3060 
catgaacaaa ggtggctata aagaggtcat 3120 
tccttattcc atagaaaagc cttgacttga 3180 
ttattttttt ctttaacatc cctaaaattt 3240 
ctcctctcct gactactccc agtcatagct 3300
ctgcagccca agcttggcgt aatcatggtc 3360
gctcacaatt ccacacaaca tacgagccgg 3420 
atgagtgagc taactcacat taattgcgtt 3480 
cctgtcgtgc cagcggatcc gcatctcaat 3540 
ccgcccatcc cgcccctaac tccgcccagt 3600
atttttttta tttatgcaga ggccgaggcc 3660
tgaggaggct tttttggagg cctaggcttt 3720
ataatggtta caaataaagc aatagcatca 3780
tgcattctag ttgtggtttg tccaaactca 3840
gcattaatga atcggccaac gcgcggggag 3900
ttcctcgctc actgactcgc tgcgctcggt 3960
ctcaaaggcg gtaatacggt tatccacaga 4020
agcaaaaggc cagcaaaagg ccaggaaccg 4080
taggctccgc ccccctgacg agcatcacaa 4140 
cccgacagga ctataaagat accaggcgtt 4200 
tgttccgacc ctgccgctta ccggatacct 4260 
gctttctcaa tgctcacgct gtaggtatct 4320 
gggctgtgtg cacgaacccc ccgttcagcc 4380
tcttgagtcc aacccggtaa gacacgactt 4440
gattagcaga gcgaggtatg taggcggtgc 4500
cggctacact agaaggacag tatttggtat 4560 
aaaaagagtt ggtagctctt gatccggcaa 4620
tgtttgcaag cagcagatta cgcgcagaaa 4680 
ttctacgggg tctgacgctc agtggaacga 4740
attatcaaaa aggatcttca cctagatcct 4800 
tttaaattaa
cagttaccaa 
catagttgcc 
ccccagtgct 
aaaccagcca 
ccagtctatt 
caacgttgtt
attcagctcc 
agcggttagc 
actcatggtt 
ttctgtgact 
ttgctcttgc 
gctcatcatt 
atccagttcg 
cagcgtttct 
gacacggaaa 
gggttattgt
ggttccgcgc 



aaatgaagtt 
tgcttaatca 
tgactccccg 
gcaatgatac 
gccggaaggg 
aattgttgcc 
gccattgcta 
ggttcccaac 
tccttcggtc 
atggcagcac 
ggtgagtact 
ccggcgtcaa
ggaaaacgtt 
atgtaaccca 
gggtgagcaa 
tgttgaatac 
ctcatgagcg 
acatttcccc 



ttaaatcaat 
gtgaggcacc 
tcgtgtagat 
cgcgagaccc 
ccgagcgcag 
gggaagctag 
caggcatcgt 
gatcaaggcg 
ctccgatcgt 
tgcataattc 
caaccaagtc 
tacgggataa 
cttcggggcg 
ctcgtgcacc 
aaacaggaag 
tcatactctt 
gatacatatt 
gaaaagtgcc 



ctaaagtata 
tatctcagcg 
aactacgata 
acgctcaccg 
aagtggtcct 
agtaagtagt 
ggtgtcacgc 
agttacatga 
tgtcagaagt 
tcttactgtc 
attctgagaa 
taccgcgcca 
aaaactctca 
caactgatct 
gcaaaatgcc 
cctttttcaa 
tgaatgtatt 
acctg 



tatgagtaaa 
atctgtctat 
cgggagggct 
gctccagatt 
gcaactttat 
tcgccagtta 
tcgtcgtttg 
tcccccatgt 
aagttggccg 
atgccatccg 
tagtgtatgc 
catagcagaa 
aggatcttac 
tcagcatctt 
gcaaaaaagg 
tattattgaa 
tagaaaaata 



cttggtctga 
ttcgttcatc 
taccatctgg 
tatcagcaat 
ccgcctccat 
atagtttgcg 
gtatggcttc 
tgtgcaaaaa 
cagtgttatc 
taagatgctt 
ggcgaccgag 
ctttaaaagt 
cgctgttgag 
ttactttcac 
gaataagggc
gcatttatca 
aacaaatagg 



4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5855 



<210> 113 
<211> 4346 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pSV40-193AttpsensePur Plasmid



<400> 113 

ccggtgccgc

atgaccgagt 

cgcaccctcg 

cgccacatcg 

atcggcaagg 

agcgtcgaag 

tcccggctgg 

cccgcgtggt 

agcgccgtcg 

gagacctccg 

gacgtcgagg 

cgcccgcccc 

cgaagccgac

gaggatcata 

acacctcccc 

tgcagcttat 

tttttcactg 

gatccgcgcc 

gcttggcgta 

cacacaacat 

aactcacatt 

agctgcatta 

ccgcttcctc 

ctcactcaaa 

tgtgagcaaa 

tccataggct 

gaaacccgac 

ctcctgttcc 

tggcgctttc 

agctgggctg 

atcgtcttga 

acaggattag 

actacggcta 

tcggaaaaag 

tttttgtttg 

tcttttctac 

tgagattatc 

caatctaaag 

cacctatctc 



caccatcccc 
acaagcccac 
ccgccgcgtt 
agcgggtcac 
tgtgggtcgc 
cgggggcggt 
ccgcgcagca 
tcctggccac 
tgctccccgg 
cgccccgcaa 
tgcccgaagg 
acgacccgca 
ccgggcggcc 
atcagccata 
ctgaacctga 
aatggttaca 
cattctagtt 
ggatccttaa 
atcatggtca 
acgagccgga 
aattgcgttg 
atgaatcggc 
gctcactgac 
ggcggtaata 
aggccagcaa 
ccgcccccct 
aggactataa 
gaccctgccg 
tcatagctca 
tgtgcacgaa 
gtccaacccg 
cagagcgagg 
cactagaagg 
agttggtagc 
caagcagcag 

ggggtctgac 

aaaaaggatc 
tat at at gag 
agcgatctgt 



tgacccacgc 
ggtgcgcctc 
cgccgactac 
cgagctgcaa 
ggacgacggc 
gttcgccgag 
acagatggaa 
cgtcggcgtc 
agtggaggcg 
cctccccttc 
accgcgcacc 
gcgcccgacc 
ccgccgaccc 
ccacatttgt 
aacataaaat 
aataaagcaa 
gtggtttgtc 
ttaagtctag 
tagctgtttc 
agcataaagt 
cgctcactgc 
caacgcgcgg 
tcgctgcgct 
cggttatcca 
aaggccagga 
gacgagcat c 
agat accagg 
cttaccggat 
cgctgtaggt 
ccccccgttc 
gtaagacacg 
tatgtaggcg 
acagtatttg 
tcttgatccg 
attacgcgca 
gctcagtgga 
ttcacctaga 
taaacttggt 
ctatttcgtt 



ccctgacccc 
gccacccgcg 
cccgccacgc 
gaactcttcc 
gccgcggtgg 
atcggcccgc 
ggcctcctgg 
tcgcccgacc 
gccgagcgcg 
tacgagcggc 
tggtgcatga 
gaaaggagcg 
cgcacccgcc 
agaggtttta 
gaatgcaatt 
tagcatcaca 
caaactcatc 
agtcgactgt 
ctgtgtgaaa 
gtaaagcctg 
ccgctttcca 
ggagaggcgg 
cggtcgttcg 
cagaatcagg 
accgtaaaaa 
acaaaaatcg 
cgtttccccc 
acctgtccgc 
atctcagttc 
agcccgaccg 
acttatcgcc 
gtgctacaga 
gtatctgcgc 
gcaaacaaac 
gaaaaaaagg 
acgaaaactc 
tccttttaaa 
ctgacagtta 
catccatagt 



tcacaaggag 
acgacgtccc 
gccacaccgt 
tcacgcgcgt 
cggtctggac 
gcatggccga 
cgccgcaccg 
accagggcaa 
ccggggtgcc 
tcggcttcac 
cccgcaagcc 
cacgacccca 
cccgaggccc 
cttgctttaa 
gttgttgtta 
aatttcacaa 
aatgtatctt 
ttaaacctgc 
ttgttatccg 
gggtgcctaa 
gtcgggaaac 
tttgcgtatt 
gctgcggcga 
ggataacgca 
ggccgcgttg 
acgctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
actggcagca 
gttcttgaag 
tctgctgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
ttaaaaatga 
ccaatgctta 
tgcctgactc 



acgaccttcc 
ccgggccgta 
cgacccggac 
cgggctcgac 
cacgccggag 
gttgagcggt 
gc c c aaggag 
gggtctgggc 
cgccttcctg 
cgtcaccgcc 
cggtgcctga 
tggctccgac 
accgactcta 
aaaacctccc 
acttgtttat 
ataaagcatt 
atcatgtctg 
aggcatgcaa 
ctcacaattc 
tgagtgagct 
ctgtcgtgcc 
gggcgctctt 
gcggtatcag 
ggaaagaaca 
ctggcgtttt 
cagaggtggc 
ctcgtgcgct 
tcgggaagcg 
gttcgctcca 
tccggtaact 
gccactggta 
tggtggccta 
ccagttacct 
agcggtggtt 
gatcctttga 
attttggtca 
agttttaaat 
atcagtgagg 
cccgtcgtgt 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 
agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag 2400
acccacgctc accggctcca gatttatcag caataaacca gccagccgga agggccgagc 2460
gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag 2520
ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca 2580
tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa 2640
ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga 2700
tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata 2760
attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag tactcaacca 2820
agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg 2880
ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa cgttcttcgg 2940
ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg 3000
cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag 3060
gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac 3120
tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg agcggataca 3180
tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt ccccgaaaag 3240
tgccacctga cgtctaagaa accattatta tcatgacatt aacctataaa aataggcgta 3300
tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc 3360
agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga caagcccgtc 3420
agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg gcatcagagc 3480
agattgtact gagagtgcac catatgcggt gtgaaatacc gcacagatgc gtaaggagaa 3540
aataccgcat caggcgccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg 3600
tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg atgtgctgca aggcgattaa 3660
gttgggtaac gccagggttt tcccagtcac gacgttgtaa aacgacggcc agtgaattcg 3720
agctgtggaa tgtgtgtcag ttagggtgtg gaaagtcccc aggctcccca gcaggcagaa 3780
gtatgcaaag catgcatctc aattagtcag caaccaggtg tggaaagtcc ccaggctccc 3840
cagcaggcag aagtatgcaa agcatgcatc tcaattagtc agcaaccata gtcccgcccc 3900
taactccgcc catcccgccc ctaactccgc ccagttccgc ccattctccg ccccatggct 3960
gactaatttt ttttatttat gcagaggccg aggccgcctc ggcctctgag ctattccaga 4020
agtagtgagg aggctttttt ggaggctcgg tacccccttg cgctaatgct ctgttacagg 4080
tcactaatac catctaagta gttgattcat agtgactgca tatgttgtgt tttacagtat 4140
tatgtagtct gttttttatg caaaatctaa tttaatatat tgatatttat atcattttac 4200
gtttctcgtt cagctttttt atactaagtt ggcattataa aaaagcattg cttatcaatt 4260
tgttgcaacg aacaggtcac tatcagtcaa aataaaatca ttatttgatt tcaattttgt 4320
cccactccct gcctctgggg ggcgcg 4346



<210> 114
<211> 3166
<212> DNA

<213> Artificial Sequence



<220> 

<223> p18attBZeo Plasmid



<400> 114 

cagttgccgg ccgggtcgcg cagggcgaac tcccgccccc acggctgctc gccgatctcg 60
gtcatggccg gcccggaggc gtcccggaag ttcgtggaca cgacctccga ccactcggcg 120
tacagctcgt ccaggccgcg cacccacacc caggccaggg tgttgtccgg caccacctgg 180
tcctggaccg cgctgatgaa cagggtcacg tcgtcccgga ccacaccggc gaagtcgtcc 240
tccacgaagt cccgggagaa cccgagccgg tcggtccaga actcgaccgc tccggcgacg 300
tcgcgcgcgg tgagcaccgg aacggcactg gtcaacttgg ccatggatcc agatttcgct 360
caagttagta taaaaaagca ggcttcaatc ctgcagagaa gcttgcatgc ctgcaggtcg 420
actctagagg atccccgggt accgagctcg aattcgtaat catggtcata gctgtttcct 480
gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt 540
aaagcctggg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc 600
gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg 660
agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 720
gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca 780
gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 840
cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac 900
aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg 960
tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac 1020
ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat 1080
ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 1140
cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac 1200
ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt 1260
gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt 1320
atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc 1380
aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga 1440
aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 1500
gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc 1560
cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct 1620
gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca 1680
tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct 1740
ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca 1800
ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc 1860
atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg 1920
cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct 1980
tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa 2040
aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta 2100
tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc 2160
ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg 2220
agttgctctt gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa 2280
gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg 2340
agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc 2400
accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg 2460
gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat 2520
cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 2580
ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg tagttaacaa aaaaaagccc 2640
gccgaagcgg gctttattac caagcgaagc gccattcgcc attcaggctg cgcaactgtt 2700
gggaagggcg atcggtgcgg gcctcttcgc tattacgcca gctggcgaaa gggggatgtg 2760
ctgcaaggcg attaagttgg gtaacgccag ggttttccca gtcacgacgt tgtaaaacga 2820
cggccagtcc gtaatacgac tcacttaagg ccttgactag agggtcgacg gtatacagac 2880
atgataagat acattgatga gtttggacaa accacaacta gaatgcagtg aaaaaaatgc 2940
tttatttgtg aaatttgtga tgctattgct ttatttgtaa ccattataag ctgcaataaa 3000
caagttgggg tgggcgaaga actccagcat gagatccccg cgctggagga tcatccagcc 3060
ggcgtcccgg aaaacgattc cgaagcccaa cctttcatag aaggcggcgg tggaatcgaa 3120
atctcgtagc acgtgtcagt cctgctcctc ggccacgaag tgcacg 3166



<210> 115 

<211> 7600 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> p18attBZeo3'6XHS4eGFP Plasmid



<400> 115 

cagttgccgg ccgggtcgcg cagggcgaac tcccgccccc acggctgctc gccgatctcg 60
gtcatggccg gcccggaggc gtcccggaag ttcgtggaca cgacctccga ccactcggcg 120
tacagctcgt ccaggccgcg cacccacacc caggccaggg tgttgtccgg caccacctgg 180
tcctggaccg cgctgatgaa cagggtcacg tcgtcccgga ccacaccggc gaagtcgtcc 240
tccacgaagt cccgggagaa cccgagccgg tcggtccaga actcgaccgc tccggcgacg 300
tcgcgcgcgg tgagcaccgg aacggcactg gtcaacttgg ccatggatcc agatttcgct 360
caagttagta taaaaaagca ggcttcaatc ctgcagagaa gcttgatcta gttattaata 420
gtaatcaatt acggggtcat tagttcatag cccatatatg gagttccgcg ttacataact 480
tacggtaaat ggcccgcctg gctgaccgcc caacgacccc cgcccattga cgtcaataat 540
gacgtatgtt cccatagtaa cgccaatagg gactttccat tgacgtcaat gggtggacta 600
tttacggtaa actgcccact tggcagtaca tcaagtgtat catatgccaa gtacgccccc 660
tattgacgtc aatgacggta aatggcccgc ctggcattat gcccagtaca tgaccttatg 720
ggactttcct acttggcagt acatctacgt attagtcatc gctattacca tgggtcgagg 780
tgagccccac gttctgcttc actctcccca tctccccccc ctccccaccc ccaattttgt 840
atttatttat tttttaatta ttttgtgcag cgatgggggc gggggggggg ggggcgcgcg 900
ccaggcgggg cggggcgggg cgaggggcgg ggcggggcga ggcggagagg tgcggcggca 960
gccaatcaga gcggcgcgct ccgaaagttt ccttttatgg cgaggcggcg gcggcggcgg 1020
ccctataaaa agcgaagcgc gcggcgggcg ggagtcgctg cgttgccttc gccccgtgcc 1080
ccgctccgcg ccgcctcgcg ccgcccgccc cggctctgac tgaccgcgtt actcccacag 1140
gtgagcgggc gggacggccc ttctcctccg ggctgtaatt agcgcttggt ttaatgacgg 1200
ctcgtttctt ttctgtggct gcgtgaaagc cttaaagggc tccgggaggg ccctttgtgc 1260
gggggggagc ggctcggggg gtgcgtgcgt gtgtgtgtgc gtggggagcg ccgcgtgcgg 1320
cccgcgctgc ccggcggctg tgagcgctgc gggcgcggcg cggggctttg tgcgctccgc 1380
gtgtgcgcga ggggagcgcg gccgggggcg gtgccccgcg gtgcgggggg gctgcgaggg 1440
gaacaaaggc tgcgtgcggg gtgtgtgcgt gggggggtga gcagggggtg tgggcgcggc 1500
ggtcgggctg taaccccccc ctgcaccccc ctccccgagt tgctgagcac ggcccggctt 1560
cgggtgcggg gctccgtgcg gggcgtggcg cggggctcgc cgtgccgggc ggggggtggc 1620



WO 02/097059 



ggcaggtggg ggtgccgggc ggggcggggc cgcctcgggc cggggagggc tcgggggagg 1680
ggcgcggcgg ccccggagcg ccggcggctg tcgaggcgcg gcgagccgca gccattgcct 1740
tttatggtaa tcgtgcgaga gggcgcaggg acttcctttg tcccaaatct ggcggagccg 1800
aaatctggga ggcgccgccg caccccctct agcgggcgcg ggcgaagcgg tgcggcgccg 1860
gcaggaagga aatgggcggg gagggccttc gtgcgtcgcc gcgccgccgt ccccttctcc 1920
atctccagcc tcggggctgc cgcaggggga cggctgcctt cgggggggac ggggcagggc 1980
ggggttcggc ttctggcgtg tgaccggcgg ctctagagcc tctgctaacc atgttcatgc 2040
cttcttcttt ttcctacagc tcctgggcaa cgtgctggtt gttgtgctgt ctcatcattt 2100
tggcaaagaa ttcgccacca tggtgagcaa gggcgaggag ctgttcaccg gggtggtgcc 2160
catcctggtc gagctggacg gcgacgtaaa cggccacaag ttcagcgtgt ccggcgaggg 2220
cgagggcgat gccacctacg gcaagctgac cctgaagttc atctgcacca ccggcaagct 2280
gcccgtgccc tggcccaccc tcgtgaccac cctgacctac ggcgtgcagt gcttcagccg 2340
ctaccccgac cacatgaagc agcacgactt cttcaagtcc gccatgcccg aaggctacgt 2400
ccaggagcgc accatcttct tcaaggacga cggcaactac aagacccgcg ccgaggtgaa 2460
gttcgagggc gacaccctgg tgaaccgcat cgagctgaag ggcatcgact tcaaggagga 2520
cggcaacatc ctggggcaca agctggagta caactacaac agccacaacg tctatatcat 2580
ggccgacaag cagaagaacg gcatcaaggt gaacttcaag atccgccaca acatcgagga 2640
cggcagcgtg cagctcgccg accactacca gcagaacacc cccatcggcg acggccccgt 2700
gctgctgccc gacaaccact acctgagcac ccagtccgcc ctgagcaaag accccaacga 2760
gaagcgcgat cacatggtcc tgctggagtt cgtgaccgcc gccgggatca ctctcggcat 2820
ggacgagctg tacaagtaag aattcactcc tcaggtgcag gctgcctatc agaaggtggt 2880
ggctggtgtg gccaatgccc tggctcacaa ataccactga gatctttttc cctctgccaa 2940
aaattatggg gacatcatga agccccttga gcatctgact tctggctaat aaaggaaatt 3000
tattttcatt gcaatagtgt gttggaattt tttgtgtctc tcactcggaa ggacatatgg 3060
gagggcaaat catttaaaac atcagaatga gtatttggtt tagagtttgg caacatatgc 3120
catatgctgg ctgccatgaa caaaggtggc tataaagagg tcatcagtat atgaaacagc 3180
cccctgctgt ccattcctta ttccatagaa aagccttgac ttgaggttag atttttttta 3240
tattttgttt tgtgttattt ttttctttaa catccctaaa attttcctta catgttttac 3300
tagccagatt tttcctcctc tcctgactac tcccagtcat agctgtccct cttctcttat 3360
gaagatccct cgacctgcag cccaagcttg catgcctgca ggtcgactct agtggatccc 3420
ccgccccgta tcccccaggt gtctgcaggc tcaaagagca gcgagaagcg ttcagaggaa 3480
agcgatcccg tgccaccttc cccgtgcccg ggctgtcccc gcacgctgcc ggctcgggga 3540
tgcgggggga gcgccggacc ggagcggagc cccgggcggc tcgctgctgc cccctagcgg 3600
gggagggacg taattacatc cctgggggct ttgggggggg gctgtccccg tgagcggatc 3660
cgcggccccg tatcccccag gtgtctgcag gctcaaagag cagcgagaag cgttcagagg 3720
aaagcgatcc cgtgccacct tccccgtgcc cgggctgtcc ccgcacgctg ccggctcggg 3780
gatgcggggg gagcgccgga ccggagcgga gccccgggcg gctcgctgct gccccctagc 3840
gggggaggga cgtaattaca tccctggggg ctttgggggg gggctgtccc cgtgagcgga 3900
tccgcggccc cgtatccccc aggtgtctgc aggctcaaag agcagcgaga agcgttcaga 3960
ggaaagcgat cccgtgccac cttccccgtg cccgggctgt ccccgcacgc tgccggctcg 4020
gggatgcggg gggagcgccg gaccggagcg gagccccggg cggctcgctg ctgcccccta 4080
gcgggggagg gacgtaatta catccctggg ggctttgggg gggggctgtc cccgtgagcg 4140
gatccgcggc cccgtatccc ccaggtgtct gcaggctcaa agagcagcga gaagcgttca 4200
gaggaaagcg atcccgtgcc accttccccg tgcccgggct gtccccgcac gctgccggct 4260
cggggatgcg gggggagcgc cggaccggag cggagccccg ggcggctcgc tgctgccccc 4320
tagcggggga gggacgtaat tacatccctg ggggctttgg gggggggctg tccccgtgag 4380
cggatccgcg gccccgtatc ccccaggtgt ctgcaggctc aaagagcagc gagaagcgtt 4440
cagaggaaag cgatcccgtg ccaccttccc cgtgcccggg ctgtccccgc acgctgccgg 4500
ctcggggatg cggggggagc gccggaccgg agcggagccc cgggcggctc gctgctgccc 4560
cctagcgggg gagggacgta attacatccc tgggggcttt gggggggggc tgtccccgtg 4620
agcggatccg cggccccgta tcccccaggt gtctgcaggc tcaaagagca gcgagaagcg 4680
ttcagaggaa agcgatcccg tgccaccttc cccgtgcccg ggctgtcccc gcacgctgcc 4740
ggctcgggga tgcgggggga gcgccggacc ggagcggagc cccgggcggc tcgctgctgc 4800
cccctagcgg gggagggacg taattacatc cctgggggct ttgggggggg gctgtccccg 4860
tgagcggatc cgcggggctg caggaattcg taatcatggt catagctgtt tcctgtgtga 4920
aattgttatc cgctcacaat tccacacaac atacgagccg gaagcataaa gtgtaaagcc 4980
tggggtgcct aatgagtgag ctaactcaca ttaattgcgt tgcgctcact gcccgctttc 5040
cagtcgggaa acctgtcgtg ccagctgcat taatgaatcg gccaacgcgc ggggagaggc 5100
ggtttgcgta ttgggcgctc ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt 5160
cggctgcggc gagcggtatc agctcactca aaggcggtaa tacggttatc cacagaatca 5220
ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag gaaccgtaaa 5280
aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca tcacaaaaat 5340
cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca ggcgtttccc 5400
cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc 5460
gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag gtatctcagt 5520
tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt tcagcccgac 5580
cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca cgacttatcg 5640
ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg cggtgctaca 5700
gagttcttga agtggtggcc taactacggc tacactagaa ggacagtatt tggtatctgc 5760
gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc cggcaaacaa 5820
accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa 5880
ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg gaacgaaaac 5940
tcacgttaag ggattttggt catgagatta tcaaaaagga tcttcaccta gatcctttta 6000
aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg gtctgacagt 6060
taccaatgct taatcagtga ggcacctatc tcagcgatct gtctatttcg ttcatccata 6120
gttgcctgac tccccgtcgt gtagataact acgatacggg agggcttacc atctggcccc 6180
agtgctgcaa tgataccgcg agacccacgc tcaccggctc cagatttatc agcaataaac 6240
cagccagccg gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc ctccatccag 6300
tctattaatt gttgccggga agctagagta agtagttcgc cagttaatag tttgcgcaac 6360
gttgttgcca ttgctacagg catcgtggtg tcacgctcgt cgtttggtat ggcttcattc 6420
agctccggtt cccaacgatc aaggcgagtt acatgatccc ccatgttgtg caaaaaagcg 6480
gttagctcct tcggtcctcc gatcgttgtc agaagtaagt tggccgcagt gttatcactc 6540
atggttatgg cagcactgca taattctctt actgtcatgc catccgtaag atgcttttct 6600
gtgactggtg agtactcaac caagtcattc tgagaatagt gtatgcggcg accgagttgc 6660
tcttgcccgg cgtcaatacg ggataatacc gcgccacata gcagaacttt aaaagtgctc 6720
atcattggaa aacgttcttc ggggcgaaaa ctctcaagga tcttaccgct gttgagatcc 6780
agttcgatgt aacccactcg tgcacccaac tgatcttcag catcttttac tttcaccagc 6840
gtttctgggt gagcaaaaac aggaaggcaa aatgccgcaa aaaagggaat aagggcgaca 6900
cggaaatgtt gaatactcat actcttcctt tttcaatatt attgaagcat ttatcagggt 6960
tattgtctca tgagcggata catatttgaa tgtatttaga aaaataaaca aataggggtt 7020
ccgcgcacat ttccccgaaa agtgccacct gacgtagtta acaaaaaaaa gcccgccgaa 7080
gcgggcttta ttaccaagcg aagcgccatt cgccattcag gctgcgcaac tgttgggaag 7140
ggcgatcggt gcgggcctct tcgctattac gccagctggc gaaaggggga tgtgctgcaa 7200
ggcgattaag ttgggtaacg ccagggtttt cccagtcacg acgttgtaaa acgacggcca 7260
gtccgtaata cgactcactt aaggccttga ctagagggtc gacggtatac agacatgata 7320
agatacattg atgagtttgg acaaaccaca actagaatgc agtgaaaaaa atgctttatt 7380
tgtgaaattt gtgatgctat tgctttattt gtaaccatta taagctgcaa taaacaagtt 7440
ggggtgggcg aagaactcca gcatgagatc cccgcgctgg aggatcatcc agccggcgtc 7500
ccggaaaacg attccgaagc ccaacctttc atagaaggcg gcggtggaat cgaaatctcg 7560
tagcacgtgt cagtcctgct cctcggccac gaagtgcacg 7600



<210> 116 
<211> 7631 
<212> DNA 

<213> Artificial Sequence 



<220>

<223> p18attBZeo5'6XHS4eGFP Plasmid



<400> 116 

cagttgccgg ccgggtcgcg cagggcgaac tcccgccccc acggctgctc gccgatctcg 60
gtcatggccg gcccggaggc gtcccggaag ttcgtggaca cgacctccga ccactcggcg 120
tacagctcgt ccaggccgcg cacccacacc caggccaggg tgttgtccgg caccacctgg 180
tcctggaccg cgctgatgaa cagggtcacg tcgtcccgga ccacaccggc gaagtcgtcc 240
tccacgaagt cccgggagaa cccgagccgg tcggtccaga actcgaccgc tccggcgacg 300
tcgcgcgcgg tgagcaccgg aacggcactg gtcaacttgg ccatggatcc agatttcgct 360
caagttagta taaaaaagca ggcttcaatc ctgcagagaa gcttgatatc gaattcctgc 420
agccccgcgg atccgctcac ggggacagcc cccccccaaa gcccccaggg atgtaattac 480
gtccctcccc cgctaggggg cagcagcgag ccgcccgggg ctccgctccg gtccggcgct 540
ccccccgcat ccccgagccg gcagcgtgcg gggacagccc gggcacgggg aaggtggcac 600
gggatcgctt tcctctgaac gcttctcgct gctctttgag cctgcagaca cctgggggat 660
acggggccgc ggatccgctc acggggacag ccccccccca aagcccccag ggatgtaatt 720
acgtccctcc cccgctaggg ggcagcagcg agccgcccgg ggctccgctc cggtccggcg 780
ctccccccgc atccccgagc cggcagcgtg cggggacagc ccgggcacgg ggaaggtggc 840
acgggatcgc tttcctctga acgcttctcg ctgctctttg agcctgcaga cacctggggg 900
atacggggcc gcggatccgc tcacggggac agcccccccc caaagccccc agggatgtaa 960
ttacgtccct cccccgctag ggggcagcag cgagccgccc ggggctccgc tccggtccgg 1020
cgctcccccc gcatccccga gccggcagcg tgcggggaca gcccgggcac ggggaaggtg 1080
gcacgggatc gctttcctct gaacgcttct cgctgctctt tgagcctgca gacacctggg 1140
ggatacgggg ccgcggatcc gctcacgggg acagcccccc cccaaagccc ccagggatgt 1200
aattacgtcc ctcccccgct agggggcagc agcgagccgc ccggggctcc gctccggtcc 1260
ggcgctcccc ccgcatcccc gagccggcag cgtgcgggga cagcccgggc acggggaagg 1320
tggcacggga tcgctttcct ctgaacgctt ctcgctgctc tttgagcctg cagacacctg 1380
ggggatacgg ggccgcggat ccgctcacgg ggacagcccc cccccaaagc ccccagggat 1440
gtaattacgt ccctcccccg ctagggggca gcagcgagcc gcccggggct ccgctccggt 1500
ccggcgctcc ccccgcatcc ccgagccggc agcgtgcggg gacagcccgg gcacggggaa 1560
ggtggcacgg gatcgctttc ctctgaacgc ttctcgctgc tctttgagcc tgcagacacc 1620
tgggggatac ggggccgcgg atccgctcac ggggacagcc cccccccaaa gcccccaggg 1680
atgtaattac gtccctcccc cgctaggggg cagcagcgag ccgcccgggg ctccgctccg 1740
gtccggcgct ccccccgcat ccccgagccg gcagcgtgcg gggacagccc gggcacgggg 1800
aaggtggcac gggatcgctt tcctctgaac gcttctcgct gctctttgag cctgcagaca 1860
cctgggggat acggggcggg ggatccacta gttattaata gtaatcaatt acggggtcat 1920
tagttcatag cccatatatg gagttccgcg ttacataact tacggtaaat ggcccgcctg 1980
gctgaccgcc caacgacccc cgcccattga cgtcaataat gacgtatgtt cccatagtaa 2040
cgccaatagg gactttccat tgacgtcaat gggtggacta tttacggtaa actgcccact 2100
tggcagtaca tcaagtgtat catatgccaa gtacgccccc tattgacgtc aatgacggta 2160
aatggcccgc ctggcattat gcccagtaca tgaccttatg ggactttcct acttggcagt 2220
acatctacgt attagtcatc gctattacca tgggtcgagg tgagccccac gttctgcttc 2280
actctcccca tctccccccc ctccccaccc ccaattttgt atttatttat tttttaatta 2340
ttttgtgcag cgatgggggc gggggggggg ggggcgcgcg ccaggcgggg cggggcgggg 2400
cgaggggcgg ggcggggcga ggcggagagg tgcggcggca gccaatcaga gcggcgcgct 2460
ccgaaagttt ccttttatgg cgaggcggcg gcggcggcgg ccctataaaa agcgaagcgc 2520
gcggcgggcg ggagtcgctg cgttgccttc gccccgtgcc ccgctccgcg ccgcctcgcg 2580
ccgcccgccc cggctctgac tgaccgcgtt actcccacag gtgagcgggc gggacggccc 2640
ttctcctccg ggctgtaatt agcgcttggt ttaatgacgg ctcgtttctt ttctgtggct 2700
gcgtgaaagc cttaaagggc tccgggaggg ccctttgtgc gggggggagc ggctcggggg 2760
gtgcgtgcgt gtgtgtgtgc gtggggagcg ccgcgtgcgg cccgcgctgc ccggcggctg 2820
tgagcgctgc gggcgcggcg cggggctttg tgcgctccgc gtgtgcgcga ggggagcgcg 2880
gccgggggcg gtgccccgcg gtgcgggggg gctgcgaggg gaacaaaggc tgcgtgcggg 2940
gtgtgtgcgt gggggggtga gcagggggtg tgggcgcggc ggtcgggctg taaccccccc 3000
ctgcaccccc ctccccgagt tgctgagcac ggcccggctt cgggtgcggg gctccgtgcg 3060
gggcgtggcg cggggctcgc cgtgccgggc ggggggtggc ggcaggtggg ggtgccgggc 3120
ggggcggggc cgcctcgggc cggggagggc tcgggggagg ggcgcggcgg ccccggagcg 3180
ccggcggctg tcgaggcgcg gcgagccgca gccattgcct tttatggtaa tcgtgcgaga 3240
gggcgcaggg acttcctttg tcccaaatct ggcggagccg aaatctggga ggcgccgccg 3300
caccccctct agcgggcgcg ggcgaagcgg tgcggcgccg gcaggaagga aatgggcggg 3360
gagggccttc gtgcgtcgcc gcgccgccgt ccccttctcc atctccagcc tcggggctgc 3420
cgcaggggga cggctgcctt cgggggggac ggggcagggc ggggttcggc ttctggcgtg 3480
tgaccggcgg ctctagagcc tctgctaacc atgttcatgc cttcttcttt ttcctacagc 3540
tcctgggcaa cgtgctggtt gttgtgctgt ctcatcattt tggcaaagaa ttcgccacca 3600
tggtgagcaa gggcgaggag ctgttcaccg gggtggtgcc catcctggtc gagctggacg 3660
gcgacgtaaa cggccacaag ttcagcgtgt ccggcgaggg cgagggcgat gccacctacg 3720
gcaagctgac cctgaagttc atctgcacca ccggcaagct gcccgtgccc tggcccaccc 3780
tcgtgaccac cctgacctac ggcgtgcagt gcttcagccg ctaccccgac cacatgaagc 3840
agcacgactt cttcaagtcc gccatgcccg aaggctacgt ccaggagcgc accatcttct 3900
tcaaggacga cggcaactac aagacccgcg ccgaggtgaa gttcgagggc gacaccctgg 3960
tgaaccgcat cgagctgaag ggcatcgact tcaaggagga cggcaacatc ctggggcaca 4020
agctggagta caactacaac agccacaacg tctatatcat ggccgacaag cagaagaacg 4080
gcatcaaggt gaacttcaag atccgccaca acatcgagga cggcagcgtg cagctcgccg 4140
accactacca gcagaacacc cccatcggcg acggccccgt gctgctgccc gacaaccact 4200
acctgagcac ccagtccgcc ctgagcaaag accccaacga gaagcgcgat cacatggtcc 4260
tgctggagtt cgtgaccgcc gccgggatca ctctcggcat ggacgagctg tacaagtaag 4320
aattcactcc tcaggtgcag gctgcctatc agaaggtggt ggctggtgtg gccaatgccc 4380
tggctcacaa ataccactga gatctttttc cctctgccaa aaattatggg gacatcatga 4440
agccccttga gcatctgact tctggctaat aaaggaaatt tattttcatt gcaatagtgt 4500
gttggaattt tttgtgtctc tcactcggaa ggacatatgg gagggcaaat catttaaaac 4560
atcagaatga gtatttggtt tagagtttgg caacatatgc catatgctgg ctgccatgaa 4620
caaaggtggc tataaagagg tcatcagtat atgaaacagc cccctgctgt ccattcctta 4680
ttccatagaa aagccttgac ttgaggttag atttttttta tattttgttt tgtgttattt 4740
ttttctttaa catccctaaa attttcctta catgttttac tagccagatt tttcctcctc 4800
tcctgactac tcccagtcat agctgtccct cttctcttat gaagatccct cgacctgcag 4860
cccaagcttg catgcctgca ggtcgactct agaggatccc cgggtaccga gctcgaattc 4920
gtaatcatgg tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa 4980
catacgagcc ggaagcataa agtgtaaagc ctggggtgcc taatgagtga gctaactcac 5040
attaattgcg ttgcgctcac tgcccgcttt ccagtcggga aacctgtcgt gccagctgca 5100
ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt attgggcgct cttccgcttc 5160
ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat cagctcactc 5220
aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc 5280
aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag 5340
gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc 5400
gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt 5460
tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct 5520
ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg 5580
ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct 5640
tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat 5700
tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg 5760
ctacactaga aggacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa 5820
aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt 5880
ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc 5940
tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt 6000
atcaaaaagg atcttcacct agatcctttt aaattaaaaa tgaagtttta aatcaatcta 6060
aagtatatat gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg aggcacctat 6120
ctcagcgatc tgtctatttc gttcatccat agttgcctga ctccccgtcg tgtagataac 6180
tacgatacgg gagggcttac catctggccc cagtgctgca atgataccgc gagacccacg 6240
ctcaccggct ccagatttat cagcaataaa ccagccagcc ggaagggccg agcgcagaag 6300
tggtcctgca actttatccg cctccatcca gtctattaat tgttgccggg aagctagagt 6360
aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt 6420
gtcacgctcg tcgtttggta tggcttcatt cagctccggt tcccaacgat caaggcgagt 6480
tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt 6540
cagaagtaag ttggccgcag tgttatcact catggttatg gcagcactgc ataattctct 6600
tactgtcatg ccatccgtaa gatgcttttc tgtgactggt gagtactcaa ccaagtcatt 6660
ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac gggataatac 6720
cgcgccacat agcagaactt taaaagtgct catcattgga aaacgttctt cggggcgaaa 6780
actctcaagg atcttaccgc tgttgagatc cagttcgatg taacccactc gtgcacccaa 6840
ctgatcttca gcatctttta ctttcaccag cgtttctggg tgagcaaaaa caggaaggca 6900
aaatgccgca aaaaagggaa taagggcgac acggaaatgt tgaatactca tactcttcct 6960
ttttcaatat tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga 7020
atgtatttag aaaaataaac aaataggggt tccgcgcaca tttccccgaa aagtgccacc 7080
tgacgtagtt aacaaaaaaa agcccgccga agcgggcttt attaccaagc gaagcgccat 7140
tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg tgcgggcctc ttcgctatta 7200
cgccagctgg cgaaaggggg atgtgctgca aggcgattaa gttgggtaac gccagggttt 7260
tcccagtcac gacgttgtaa aacgacggcc agtccgtaat acgactcact taaggccttg 7320
actagagggt cgacggtata cagacatgat aagatacatt gatgagtttg gacaaaccac 7380
aactagaatg cagtgaaaaa aatgctttat ttgtgaaatt tgtgatgcta ttgctttatt 7440
tgtaaccatt ataagctgca ataaacaagt tggggtgggc gaagaactcc agcatgagat 7500
ccccgcgctg gaggatcatc cagccggcgt cccggaaaac gattccgaag cccaaccttt 7560
catagaaggc ggcggtggaa tcgaaatctc gtagcacgtg tcagtcctgc tcctcggcca 7620
cgaagtgcac g 7631



<210> 117 

<211> 4615 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> p18attBZeo6XHS4 Plasmid



<400> 117 

cagttgccgg ccgggtcgcg cagggcgaac tcccgccccc acggctgctc gccgatctcg 60
gtcatggccg gcccggaggc gtcccggaag ttcgtggaca cgacctccga ccactcggcg 120
tacagctcgt ccaggccgcg cacccacacc caggccaggg tgttgtccgg caccacctgg 180
tcctggaccg cgctgatgaa cagggtcacg tcgtcccgga ccacaccggc gaagtcgtcc 240
tccacgaagt cccgggagaa cccgagccgg tcggtccaga actcgaccgc tccggcgacg 300
tcgcgcgcgg tgagcaccgg aacggcactg gtcaacttgg ccatggatcc agatttcgct 360
caagttagta taaaaaagca ggcttcaatc ctgcagagaa gcttgcatgc ctgcaggtcg 420
actctagtgg atcccccgcc ccgtatcccc caggtgtctg caggctcaaa gagcagcgag 480
aagcgttcag aggaaagcga tcccgtgcca ccttccccgt gcccgggctg tccccgcacg 540
ctgccggctc ggggatgcgg ggggagcgcc ggaccggagc ggagccccgg gcggctcgct 600
gctgccccct agcgggggag ggacgtaatt acatccctgg gggctttggg ggggggctgt 660
ccccgtgagc ggatccgcgg ccccgtatcc cccaggtgtc tgcaggctca aagagcagcg 720
agaagcgttc agaggaaagc gatcccgtgc caccttcccc gtgcccgggc tgtccccgca 780
cgctgccggc tcggggatgc ggggggagcg ccggaccgga gcggagcccc gggcggctcg 840
ctgctgcccc ctagcggggg agggacgtaa ttacatccct gggggctttg ggggggggct 900
gtccccgtga gcggatccgc ggccccgtat cccccaggtg tctgcaggct caaagagcag 960
cgagaagcgt tcagaggaaa gcgatcccgt gccaccttcc ccgtgcccgg gctgtccccg 1020
cacgctgccg gctcggggat gcggggggag cgccggaccg gagcggagcc ccgggcggct 1080
cgctgctgcc ccctagcggg ggagggacgt aattacatcc ctgggggctt tggggggggg 1140
ctgtccccgt gagcggatcc gcggccccgt atcccccagg tgtctgcagg ctcaaagagc 1200
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agcgagaagc gttcagagga aagcgatccc gtgccacctt ccccgtgccc gggctgtccc 12 60 
cgcacgctgc cggctcgggg atgcgggggg agcgccggac cggagcggag ccccgggcgg 1320 
ctcgctgctg ccccctagcg ggggagggac gtaattacat ccctgggggc tttggggggg 13 80 
ggctgtcccc gtgagcggat ccgcggcccc gtatccccca ggtgtctgca ggctcaaaga 1440 
gcagcgagaa gcgttcagag gaaagcgatc ccgtgccacc ttccccgtgc ccgggctgtc 1500 
cccgcacgct gccggctcgg ggatgcgggg ggagcgccgg accggagcgg agccccgggc 1560
ggctcgctgc tgccccctag cgggggaggg acgtaattac atccctgggg gctttggggg 1620
ggggctgtcc ccgtgagcgg atccgcggcc ccgtatcccc caggtgtctg caggctcaaa 1680
gagcagcgag aagcgttcag aggaaagcga tcccgtgcca ccttccccgt gcccgggctg 1740 
tccccgcacg ctgccggctc ggggatgcgg ggggagcgcc ggaccggagc ggagccccgg 1800
gcggctcgct gctgccccct agcgggggag ggacgtaatt acatccctgg gggctttggg 1860
ggggggctgt ccccgtgagc ggatccgcgg ggctgcagga attcgtaatc atggtcatag 1920
ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg agccggaagc 1980 
ataaagtgta aagcctgggg tgcctaatga gtgagctaac tcacattaat tgcgttgcgc 2040
tcactgcccg ctttccagtc gggaaacctg tcgtgccagc tgcattaatg aatcggccaa 2100
cgcgcgggga gaggcggttt gcgtattggg cgctcttccg cttcctcgct cactgactcg 2160 
ctgcgctcgg tcgttcggct gcggcgagcg gtatcagctc actcaaaggc ggtaatacgg 2220 
ttatccacag aatcagggga taacgcagga aagaacatgt gagcaaaagg ccagcaaaag 2280
gccaggaacc gtaaaaaggc cgcgttgctg gcgtttttcc ataggctccg cccccctgac 2340 
gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa acccgacagg actataaaga 2400
taccaggcgt ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt 2460 
accggatacc tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca tagctcacgc 2520
tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc 2580
cccgttcagc ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta 2640 
agacacgact tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat 2700 
gtaggcggtg ctacagagtt cttgaagtgg tggcctaact acggctacac tagaaggaca 2760
gtatttggta tctgcgctct gctgaagcca gttaccttcg gaaaaagagt tggtagctct 2820
tgatccggca aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa gcagcagatt 2880 
acgcgcagaa aaaaaggatc tcaagaagat cctttgatct tttctacggg gtctgacgct 2940
cagtggaacg aaaactcacg ttaagggatt ttggtcatga gattatcaaa aaggatcttc 3000
acctagatcc ttttaaatta aaaatgaagt tttaaatcaa tctaaagtat atatgagtaa 3060
acttggtctg acagttacca atgcttaatc agtgaggcac ctatctcagc gatctgtcta 3120 
tttcgttcat ccatagttgc ctgactcccc gtcgtgtaga taactacgat acgggagggc 3180 
ttaccatctg gccccagtgc tgcaatgata ccgcgagacc cacgctcacc ggctccagat 3240
ttatcagcaa taaaccagcc agccggaagg gccgagcgca gaagtggtcc tgcaacttta 3300
tccgcctcca tccagtctat taattgttgc cgggaagcta gagtaagtag ttcgccagtt 3360
aatagtttgc gcaacgttgt tgccattgct acaggcatcg tggtgtcacg ctcgtcgttt 3420 
ggtatggctt cattcagctc cggttcccaa cgatcaaggc gagttacatg atcccccatg 3480
ttgtgcaaaa aagcggttag ctccttcggt cctccgatcg ttgtcagaag taagttggcc 3540
gcagtgttat cactcatggt tatggcagca ctgcataatt ctcttactgt catgccatcc 3600
gtaagatgct tttctgtgac tggtgagtac tcaaccaagt cattctgaga atagtgtatg 3660
cggcgaccga gttgctcttg cccggcgtca atacgggata ataccgcgcc acatagcaga 3720
actttaaaag tgctcatcat tggaaaacgt tcttcggggc gaaaactctc aaggatctta 3780
ccgctgttga gatccagttc gatgtaaccc actcgtgcac ccaactgatc ttcagcatct 3840
tttactttca ccagcgtttc tgggtgagca aaaacaggaa ggcaaaatgc cgcaaaaaag 3900
ggaataaggg cgacacggaa atgttgaata ctcatactct tcctttttca atattattga 3960 
agcatttatc agggttattg tctcatgagc ggatacatat ttgaatgtat ttagaaaaat 4020 
aaacaaatag gggttccgcg cacatttccc cgaaaagtgc cacctgacgt agttaacaaa 4080
aaaaagcccg ccgaagcggg ctttattacc aagcgaagcg ccattcgcca ttcaggctgc 4140 
gcaactgttg ggaagggcga tcggtgcggg cctcttcgct attacgccag ctggcgaaag 4200
ggggatgtgc tgcaaggcga ttaagttggg taacgccagg gttttcccag tcacgacgtt 4260
gtaaaacgac ggccagtccg taatacgact cacttaaggc cttgactaga gggtcgacgg 4320
tatacagaca tgataagata cattgatgag tttggacaaa ccacaactag aatgcagtga 4380
aaaaaatgct ttatttgtga aatttgtgat gctattgctt tatttgtaac cattataagc 4440
tgcaataaac aagttggggt gggcgaagaa ctccagcatg agatccccgc gctggaggat 4500
catccagccg gcgtcccgga aaacgattcc gaagcccaac ctttcataga aggcggcggt 4560
ggaatcgaaa tctcgtagca cgtgtcagtc ctgctcctcg gccacgaagt gcacg 4615

<210> 118 
<211> 17384 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pFK161 Plasmid 



<400> 118 
gcgcacgagg
gccacctctg 
aaaacgccag 
tgttctttcc 
ctgataccgc 
aagagcgctg 
gttgttgctc 
ggtgattcat 
aggagcacga 
gacaaaccac 
ttgctttatt 
attttatgtt 
acaaatgtgg 
attaacccct 
agcagacact 
atgcctactt 
tttttccttt 
actcaaaaaa 
tttfcggagga 
ttctgagcaa 
tccataggtt 
acttaaaaat 
tgfccacacca 
tccccactcc 
gccgacggat 
aaccaactcg 
cgcfcggagga 
aaggcggcgg 
tcgaacccca 
gcgaatcggg 
gctcttcagc 
gccggccaca 
aggcatcgcc 
gaacagttcg 
accggcttcc 
gcaggtagcc 
ctcggcagga 
ccagtccctt 
ggccagccac 
ggtcttgaca 
gcagccgatt 
agaacctgcg 
atcagatctt 
tttgcagggc 
tgtccataaa 
tctctttgcg 
tcagcaccgt 
ccctgagtgc 
tcctcactac 
gccatggggc 
gggcgggact 
ggagcctggg 
tctgcctgct 
gatctgcagg 
gcgatggata 
ttggctccaa 
tcgaggtggc 
cggcgcctac 
gacgatcagc 
ctgfcccctga 
atgccgccgg 
gccagcaaga 
gccgaaacgt 
gaataccgca 
aatgacccag 
aagtgcggcg 
tctcaagggc 



gagcttccag 
acttgagcgt 
caacgcggcc 
tgcgttatcc 
tcgccgcagc 
acttccgcgt 
aggtcgcaga 
tctgctaacc 
tcatgcgcac 
aactagaatg 
tgtaaccatt 
tcaggttcag 
tatggctgat 
ttacaaatta 
ctatgcctgt 
ataaaggtta 
gtggtgtaaa 
cttagcaatt 
gtagaatgfct 
aacaggtttt 
ggaatctaaa 
tttatattta 
cagaagtaag 
tgcagttcgg 
ttgcactgcc 
cgaggggatc 
tcatccagcc 
tggaatcgaa 
gagtcccgct 
agcggcgata 
aatafccacgg 
gtcgatgaat 
atgggtcacg 
gctggcgcga 
atccgagtac 
ggatcaagcg 
gc aagg t gag 
cccgcttcag 
gatagccgcg 
aaaagaaccg 
gtctgfctgtg 
tgcaatccat 
gatcccctgc 
ttcccaacct 
accgcccagt 
cttgcgtttt 
ttctgcggac 
ttgcggcagc 
ttctggaata 
ggagaatggg 
atggttgctg 
gactttccac 
ggggagcctg 
acccaacgct 
tgttctgcca 
ttcttggagt 
ccggctccat 
aatccatgcc 
ggtccaatga 
tggtcgtcafc 
aagcgagaag 
cgtagcccag 
ttggtggcgg 
agcgacaggc 
agcgctgccg 
acgatagtca 
atcggtcgac 



ggggaaacgc 
cgatttttgt 
tttttacggt 
cctgattctg 
cgaacgaccg 
ttccagactt 
cgttttgcag 
agtaaggcaa 
ccgtcagatc 
cagtgaaaaa 
ataagctgca 
ggggaggtgfc 
tatgatctct 
aaaagctaaa 
gtggagtaag 
cagaatattt 
tagcaaagca 
ctgaaggaaa 
gagagtcagc 
cctcattaaa 
atacacaaac 
ccttagagct 
gttccttcac 
gggcatggat 
ggtagaactc 
gagcccgggg 
ggcgtcccgg 
atctcgtgat 
cagaagaact 
ccgtaaagca 
gtagccaacg 
ccagaaaagc 
acgagatcct 
gcccctgatg 
gtgctcgctc 
tatgcagccg 
atgacaggag 
tgacaacgt c 
ctgcctcgtc 
ggcgcccctg 
cccagtcata 
cttgttcaat 
gccatcagat 
t ac c agaggg 
ctagctatcg 
cccttgtcca 
tggctttcta 
gtgaaagctt 
gcfccagaggc 
cggaactggg 
actaattgag 
acctggttgc 
gggactttcc 
gcccgagatg 
agggttggtt 
ggtgaatccg 
gcaccgcgac 
aacccgttcc 
tcgaagttag 
ctacctgcct 
aatcataatg 
cgcgtcgggc 
gaccagtgac 
cgatcatcgt 
gcacctgtcc 
tgccccgcgc 
gctctccctt 



ctggtatctt 
gatgctcgtc 
tcctggccfct 
tggataaccg 
agcgcagcga 
tacgaaacac 
cagcagtcgc 
ccccgccagc 
cagacatgat 
aatgctttat 
ataaacaagt 
gggaggtttt 
agtcaaggca 
ggtacacaat 
aaaaaacagt 
ttccataatt 
agcaagagtt 
gtccttgggg 
agtagcctca 
ggcattccac 
aattagaatc 
ttaaatctct 
aaagatccgg 
gcgcggatag 
gcgaggtcgt 
tgggcgaaga 
aaaacgattc 
ggcaggttgg 
cgtcaagaag 
cgaggaagcg 
ctatgtcctg 
ggccattttc 
cgccgtcggg 
ctcttcgtcc 
gatgcgatgt 
ccgcattgca 
atcctgcccc 
gagcacagct 
ctgcagttca 
cgctgaqagc 
gccgaatagc 
catgcgaaac 
ccttggcggc 
cgccccagct 
ccatgtaagc 
gatagcccag 
cgtgttccgc 
tttgcaaaag 
cgaggcggcc 
cggagttagg 
atgcatgctt 
tgactaattg 
acaccctaac 
cgccgcgtgc 
tgcgcattca 
ttagcgaggt 
gcaacgcggg 
atgtgctcgc 
gctggtaaga 
ggacagcatg 
gggaaggcca 
cgccatgccg 
gaaggcttga 
cgcgctccag 
tacgagttgc 
ccaccggaag 
atgcgactcc 



tatagtcctg 

aggggggcgg 

ttgctggcct 
tattaccgcc 
gtcagtgagc 
ggaaaccgaa 
ttcacgttcg 
ctagccgggt 
aagatacatt 
ttgtgaaatt 
taacaacaac 
tta aagc aag 
ctatacatca 
tfctfcgagcat 
atgttatgat 
ttcttgtata 
ctattactaa 
tcttctacct 
tcatcactag 
cactgctccc 
agtagtttaa 
gtaggtagtfc 
accaaagcgg 
ccgctgctgg 
ccagcctcag 
actccagcat 
cgaagcccaa 
gcgtcgcttg 
gcgatagaag 
gtcagcccat 
atagcggtcc 
caccatgata 
atgcgcgcct 
agatcatcct 
ttcgcttggt 
tcagccatga 
ggcacttcgc 
gcgcaaggaa 
ttcagggcac 
cggaacacgg 
ctctccaccc 
gatcctcatc 
aagaaagcca 
ggcaattccg 
ccactgcaag 
tagctgacat 
ttcctttagc 
cctaggcctc 
taaataaaaa 
ggcgggatgg 
tgcatacttc 
agatgcatgc 
tgacacacat 
ggctgctgga 
cagttctccg 
gccgccggct 
gaggcagaca 
cgaggcgcat 
gccgcgagcg 
gcctgcaacg 
tccagcctcg 
gcgataatgg 
gcgagggcgt 
cgaaagcggt 
atgataaaga 
gagctgactg 
tgcattagga 



tcggggtttc 
agcctatgga 
tttgctcaca 
tttgagtgag 
gaggaagcgg 
gaccattcat 
ctcgcgtatc 
cctcaacgac 
gatgagtttg 
tgtgatgcta 
aattgcattc 
taaaacctct 
aatattcctt 
agttattaat 
tataactgtt 
gcagtgcagc 
acacagcatg 
ttctcttctt 
atggcatttc 
attcatcagt 
cacattatac 
tgtccaatta 
ccatcgtgcc 
tttcctggat 
gcagcagctg 
gagatccccg 
cctttcatag 
gtcggtcatt 
gcgatgcgct 
tcgccgccaa 
gccacaccca 
ttcggcaagc 
tgagcctggc 
gatcgacaag 
ggtcgaatgg 
tggatacttt 
ccaatagcag 
cgcccgtcgt 
cggacaggtc 
cggcatcaga 
aagcggccgg 
ctgtctcttg 
tccagtttac 
gttcgcttgc 
ctacctgctt 
tcatccgggg 
agcccttgcg 
caaaaaagcc 
aaattagtca 
gcggagttag 
tgcctgctgg 
tttgcatact 
tccacagccg 
gatggcggac 
caagaattga 
tccattcagg 
aggtataggg 
aaatcgccgt 
atccttgaag 
cggcatcccg 
cgtcgcgaac 
cctgcttctc 
gcaagattcc 
cctcgccgaa 
agacagtcat 
ggttgaaggc 
agcagcccag 
tagtaggttg
gcccaacagt 
agcccgaagt 
accgcacctg 
gtcacagcat 
tgtagtaccc 
ctcatggtaa 
aattgaaaca 
cgatggtcaa 
ttgcgaactg 
caaatagtca 
catcgcacgc 
tcggcggctt 
attatcacgt 
ttgctgtctg 
gctgaaacca 
tcagaagggc 
tattgcgctt 
gacgagctgg 
tcattcatca 
cctcagccgg 
cagaacaagg 
attgaaacgt 
acaaagcaaa 
gccatcatga 
gcaattgata 
gcgacctcgc 
cttcgtcata 
tgctgaaagc 
aacaatggaa 
tcagaactgg 
gctttatgac 
cgaaaagctg 
cgggtgtggt 
ctgggcggcg 
caacgcatat 
atcccgcaag 
gacggtgc eg 
ttagcaattt 
atgagaattc 
cgtctgccgg 
cgcacttttc 
tcacgtgttt 
ccggtggcgt 
gaggtgetec 
gcgctcccca 
tgtctgagaa 
tegtegggtg 
ggtcgegget 
gagaggectg 
aatgcccttg 
ttggtcttct 
gtcggggttt 
ggaaagggtg 
ctcgccccct 
ggcctccccg 
ccgttgctgc 
gcacaccccc 
tgggtaggcg 
tccgtcgcgt 
ctgcgccgcg 
ccccccttcc 
cctcggggtc 
gttctgtggg 
gccgctcggg 
cggtgtcgcc 
Srgtgtggtgg 



aggccgttga 
cccccggcca 
ggcgagcccg 
tggcgccggt 
gcgcatatcc 
acategtcat 
tagtccatga 
aaagagatgg 
tgcgctggat 
ttcccaacta 
ggtaatgaat 
gcacaccgta 
tgctgtgcga 
tgtccggcgc 
gtgatctgee 
gacacacagc 
agaaatfctgc 
egatgacget 
accagcgcat 
aggacgccgc 
tgaccaatat 
taaccgtcag 
tgatcgaaaa 
tggcagcaga 
tggaatgttt 
attattatca 
gggttttcgc 
acttaatgtt 
gagctttttg 
gtcaacaaaa 
caggaacagg 
tctgccgccg 
egcegggagg 
cgccatgatc 
geaaageggt 
agegctagea 
aggcccggca 
aggatgacga 
aactgtgata 
gcggccgctc 
tggtgtgtgg 
tcagtggttc 
cactttggtc 
tgcataccct 
tggagcgttc 
ttccctggtg 
gecegtgaga 
aggcgcccac 

ggggttggaa 

getttegggg 
gaagagaacc 
ggtttccctg 

tgggtccgtc 
egggcttett 

gaccgcctcc 
ctccgagttc 
ggagcatgtg 
gcgtgcgcgt 
acggtgggct 
gcgtccctct 
cgtggtgcgt 
cgcggcagcg 
gagagggtcc 
agaaeggctg 
ggtcttcgtc 
tcctcgggct 
gaetgetcag 



gcaccgccgc 
cgggcctgcc 
atcttcccca 
gatgccggcc 
atgettcgae 
cgctttccac 
aaatccttgt 
tgatctttct 
atgggataga 
aaatcatttt 
cctgatataa 
gaaagtcttt 
caggctcacg 
ggegaeggat 
ttctaaatct 
aactgaatac 
cgttgaacac 
tggcgttgag 
tcgtgacacc 
tategcaaat 
ctacaacatc 
tgccgataag 
cgcgctgaaa 
caagaaagcg 
ccccggtggt 
tttgcgggtc 
tatttatgaa 
tttatttaaa 
gcctctgtcg 
agcagctggc 
gaatgcccgt 
tcataaaatg 
ttgaagaact 
gegtagtega 
cggacagtgc 
gcacgccata 
gtaceggcat 
tgagegcat t 
aactaccgca 
ttctcgttct 
aaggcagggg 
gcgtggtcct 
gtgtctcgct 
tcccgtctgg 
caggtttgtc 
tgcctccggt 

ggggggtcga 

cccgcgacta 
agtttctcga 
gggaccggtt 
ttcctgttgc 
tgtgctcgtc 
ccgccctcag 
aeggtctega 
cgcgcgcgca 
ggggagggat 
geteggcttg 
actttcctcc 
cccgggtccc 
cgctcgcgtc 
gctgtgtgct 
ttcccacggc 
gtgtctggcg 
ttggccgcgt 
ggtaggcatc 
cccggggggc 
gggagtggtg 



cgcaaggaat 
accataccca 
tcggtgatgt 
aegatgegtc 
catgcgctca 
tgctctcgcg 
attcataaat 
aagagatgat 
tgggaatatg 
gcacgatcag 
agacaggttg 
cagttgtgag 
tctaaaagga 
gttctgfcatg 
ggcacagccg 
cagaaagaaa 
ctggtcaata 
attgatacct 
gtctccttcg 
ggtgctatcc 
agccttggta 
ttcaaagtta 
aacgetgetg 
atggatgaac 
gttatctggc 
ctttccggcg 
aattttcegg 
ataccctctg 
tttcctttct 
tgacattttc 
tetgegagge 
gtatgccgaa 
gcggcaggcc 
tagtggctcc 
tccgagaacg 
gtgactggcg 
aaccaagcct 
gttagatttc 
ttaaagctta 
gccagcgggc 
tgcggctctc 
tgtggatgtg 
tgaccatgtt 
tgtgtgcacg 
tcctaggtgc 
gctccgtctg 
ggagagaagg 
gtacgcctgt 
gagactcatt 
gcagggtctc 
cgcagacccc 
gcatgcatcc 
tgagaaagtt 
ggggtctctc 
gcgtttgctc 
cacgcggggc 
tgtggttggt 
cctcctgagg 
cacccgtctt 
cacgactttg 
tetegggctg 
tggegaaate 
ttgattgatc 
ccggcgcgac 
ggtgtgtcgg 
cgtcgtgttt 
cagtgtgatt 



ggtgcatgca 
cgccgaaaca 
eggegatata 
egg eg t agag 
caaagtaggt 
aataaagatg 
cctccaggta 
ggaatctccc 
ctgattttta 
cgcactacga 
ataaatcagt 
cctgggcaaa 
aataaatcat 
cgctgttttt 
aattgegega 
atcactttac 
cgcgttttgg 
ctgctgcaca 
aacttattcg 
acgcagcggc 
tccagcgtga 
aacctggtgt 
aatgtgcggc 
tggcttccfca 
agcagtgccg 
atccgccttg 
tttaaggcgt 
aaaagaaagg 
ctgtttttgt 
ggtgcgagta 
ggtggcaagg 
agggatgetg 
agegaggcag 
aagtagcgaa 
ggtgcgcata 
atgctgtcgg 
atgcctacag 
atacaeggtg 
tcgatgataa 
cctcgtctct 
cggcccgacg 
tgaggcgccc 
cccagagtcg 
cgctgtttct 
ctgcttctga 
gctgtgtgcc 
aggggcaaga 
gegtaggget 
gctttcccgt 
ccctgtccgc 
cccgcgcggt 
tetcteggtg 
tccttctcta 
ccgaatggtc 
tctcgtctac 
agagcctgtc 
ggctggggag 
gccgccgtgc 
cccgtgcctc 
gccgctcccg 
tgtggttgtg 
gegggagtec 
tcgctctcgg 
gtcggacgtg 
catcggtctc 
egggtegget 
cccgccggtt 



aggagatggc 
agegctcatg 
ggcgccagca 
gatcttggca 
gaatgcgcaa 
gaaaatcaat 
getatatgea 
ttcagtatcc 
tgggacagag 
actttaccca 
cttctacgcg 
ccgttaactt 
ggg t c at aaa 
ccgtggcgcg 
gcttggtttt 
ctttctgaca 
tgagcagcaa 
aaaggcaatc 
caatggagtg 
aatcgaaaca 
tgagccagcg 
tgataccaac 
gctggatgtc 
tgtccgcacg 
t cga t ag t at 
ttacggggcg 
ttccgttctt 
aaacgacagg 
ccgtggaatg 
tccgtaccat 
gtaatgaggt 
aaa 1 t gagaa 
atccacagga 
gcgagcagga 
gaaattgeat 
aatggacgat 
catccagggt 
cctgactgcg 
geggtcaaac 
ccaccccatc 
ctgccccgcg 
ggttgtgccc 

gtggatgtgg 
tgtaagcgtc 
gctggtggtg 
ttcccgtttg 
ccccccttct 
ggtgctgagc 

ggggagcttt 

ggatgctcag 
cgcccgcgtg 
gccggggctc 
gctatcttcc 
c c c t ggaggg 
cgcggcccgc 
tgtcgtcctg 

agggctccgt 

ggacggggtg 
acccgtgcct 
cgacggcggc 
tcgcctcgcc 
tccttcccct 
ggaegggace 
gggacccact 
tctctcgtgt 
cggcgctgca 
ttgcctcgcg 
tgccctgacc
gaggggcccg 
ccccctcccc 
acccgtggcc 
cggt cac egg 
gagctgtggt 
gagagggctg 
agtggtcafct 
ccggccctgt 
accctggcgg 
gatgtctacc 
cctcgttcct 
catctctcgc 
tegceggggg 
ctcgccggct 
gaegttgege 
gagcccctgc 
tgtgtcgcgt 
gacgggtggc 
tcgttggtgt 
tcgccggtgt 
cggcccggtg 
gggaeggagg 
gttggcttfcg 
tccggccgca 
cctcccgcga 
cctggtcctg 
ggtagcatat 
agtgaaactg 
ctacttggat 
tecegggggg 
ctccggccgg 
acgccccccg 
tcgccgtgcc 
gagectgaga 
cgacccgggg 
ggaatgagtc 
cagccgcggt 
tagttggatc 
ccccttgcct 
gtttactttg 
aggaataatg 
t aagagggac 
geaagaegga 
teggaggtte 
egatgeggeg 
ggttccgggg 
ccaggagtgg 
eggacaggat 
ttcttagttg 
ctaactagtt 
gcgttcagcc 
tgcacgcgcg 
aacccgttga 
gaattcccag 
accgcccgtc 
ggtcggccca 
agtaaaagtc 
ctgtggagga 
cgcgtgcgtc 
gaaggggtgg 
tcccctctcc 
gcgtcttgcc 
ggtttttgac 
cccatccccg 
ggatgtgagt 
gtcctccccg 



ggtccgacgc 
tttcggccgc 
gctcgccgca 
gtgctgtcgg 
ggtcttgggg 
ttggagggcg 
cgfcgcgaggg 
gtcccgacgg 
cgtccgtcgg 
tgggattaac 
tccctctccc 
ccctctcgcg 
geaatggege 
ctggccgctg 
tcgcggactc 
cfccgcfcgctg 
cgcacccgcc 
egggagegtg 
ctatccaggg 

ggggagtgaa 

cgcgcttctc 
eggtcgaegt 
ggagagcggg 
ccgcgtgcgt 
tgcactctcc 
ggctctccgc 
tcccaccccc 
gcttgtctca 
cgaatggctc 
aactgtggta 
ggatgcgtgc 
gggtcgggcg 
tggeggegae 
taccatggtg 
aacggctacc 
aggfcagfcgac 
cactttaaat 
aattccagct 
tfcgggagcgg 
ctcggcgccc 
aaaaaattag 
gaataggacc 
ggceggggge 
ecagagegaa 
gaagacgatc 
gcgttattcc 
ggagtatggt 
gcctgcggct 
tgacagattg 
gtggagcgat 
acgcgacccc 
acccgagatt 
ctacactgac 
accccattcg 
taagtgcggg 
gctactaccg 
cggccctggc 
gtaacaaggt 
gcggcggcgt 
ccgggtcccg 

gtggggtcgg 

ctcgtccggc 
tctttcccgt 
ccgtcccggg 
ccgcggctct 
gtcgcgtgtg 
ctcctgtccc 



ccgagcggtc 
ccttgccgtc 
geeggtcttt 
accccccgca 

gggggecgag 

tcccggcccc 
gaaaaggttg 
tgtggtggtc 
gaaggcgcgt 
cccgcgcgcg 
cgaggtctca 
gggtfccaagt 
cgcccgagtfc 
tccggtctct 
ctggcttcgc 
tgtgcttggg 
ggtgtgcggt 
tccgcctcgc 
ctcgcccccg 
tggtgctacc 
tttccgccaa 
tccggctctc 
t aagagagg t 
g t gc t cgegg 
cgttccgcgc 
cgccgccgcc 
gacgctccgc 
aagattaagc 
attaaatcag 
atfccfcagagc 
atttatcaga 
ccggcggctt 
gacccattcg 
accaegggtg 
acatccaagg 
gaaaaataac 
cctttaacga 
ecaatagegt 
gcgggcggtc 
cctcgatgct 
agtgttcaaa 
gcggttctat 
attegtattg 
ageatttgee 
agatacegtc 
catgacccgc 
tgcaaagctg 
taatttgact 
atagctcttt 
fctgtctggtt 
cgagcggfccg 
gagcaataac 
tggctcagcg 
tgafcggggat 
tcataagctt 
attggatggt 
ggagcgctga 
ttccgtaggt 
ggcccgctct 
tcgcccgcgt 
tctgggtccg 
tctgacctcg 
ccggctcttc 
ggcgttcggt 
ggcttttcta 
ggctcgcccg 
gggtacctag 



tctcggtccc 
gtcgccggcc 
tttcctctct 
tgggggegge 
gggtaagaaa 
gcggccgtgg 
ccccgcgagg 
tgttggccga 
gttggggccfc 
tgtcccggtg 
ggccttctcc 
cgctcgtcga 
cacggtgggfc 
cctgcccgac 
ccggagggtc 

gggggcccgc 

ttcgcgccgc 
ggeggctaga 
ccgacccccg 
ggtcattccc 
cccccacgcc 
ccgatgccga 
gteggagage 
acgggfctfctg 
gagcgcccgc 
tcctcctcct 
tcgcgcttcc 
catgcatgtc 
ttatggttcc 
taafcacafcgc 
tcaaaaccaa 
ggtgactcta 
aacgtctgcc 
aeggggaate 
aaggcagcag 
aatacaggac 
ggatccattg 
atattaaagt 
cgccgcgagg 
cttagctgag 
gcaggcccga 
tttgttggtt 
cgccgctaga 
aagaatgttt 
gtagttccga 
egggcagefct 
aaacttaaag 
caacaeggga 
cfccgattccg 
aattccgata 
gcgtccccca 
aggtctgtga 
tgtgcctacc 
eggggattge 
gcgtfcgatta 
ttagtgaggc 
gaagacggtc 
gaacctgegg 
ccccgtcttg 
gtggagcgag 
tctgggaccg 
ccaccctacc 
cgtgtctacg 
cgtcggggcg 
cgttggctgg 

tcccgatgcc 
ctgtcgcgtt 



ttgtgaggac 
ctcgttctgc 
ccccccctct 
egggcaegta 
gteggctegg 
cggtgtcttg 
gcaaagggaa 
ggtgcgtctg 
gccggagtgc 
tggcggtggg 
gcgcgggctc 
cctcccctcc 
tcgtcctccg 
ccccgttggc 
agggggcttc 
tgcggcctcc 
ggtcagttgg 
cgcgggtgtc 
cctgcccgtc 
tcccgcgtgg 
aacccaccac 

ggggttcggg 

tgtcccgggg 
t cggaccccg 
ccggctcacc 
ctctcgcgct 
ttacctggtt 
taagtacgea 
tttggtcgct 
cgacgggcgc 
cccggtgagc 
gataacctcg 
ctatcaactt 
agggttcgat 
gcgcgcaaat 
tctttcgagg 
gagggcaagt 
tgctgcagtt 
cgagtcaccg 
tgtcccgcgg 
gccgcctgga 
ttcggaactg 
ggtgaaattc 
tcattaatca 
ccataaacga 
ccgggaaacc 
gaattgaegg 
aacctcaccc 
tgggtggfcgg 
acgaacgaga 
acttcttaga 
tgeccttaga 
ctgcgccggc 
aattattccc 
agtccctgcc 
cctcggatcg 
gaacttgact 
aaggatcatt 
tgtgtgtcct 
gtgtctggag 
cctccgattt 
geggeggegg 
aggggeggta 
cgcgctttgc 
ggcggttgtc 
aegcttttet 
ccggcgcgga 



ccccttccgg 
tgtgtcgttc 
cctctgactg 
cgcgtccggg 
egggegggag 
cgcggtcttg 
agaggctagc 
gggggctcgt 
cgaggtgggt 
ggctccggtc 
tcggccctcc 
tccgtccttc 
cctccgcttc 
gtggtcttct 
ccggttcccc 
gcccgcccgt 
gccctggcgt 
gccgggctcc 
ccggtggtgg 
tttgactgtc 
cctgctctcc 
atttgtgccg 
cgacgctcgg 
aeggggtegg 
cccggtttgt 
ctctgtcccg 
gatcctgcca 
cggccggtac 
cgctcctctc 
tgacccccct 
tccctcccgg 
ggccgatcgc 
tcgatggtag 
teeggagagg 
tacccactcc 
ccctgtaatt 
ctggtgccag 
aaaaagctcg 
cccgtccccg 
ggcccgaagc 
taccgcagct 
aggecatgat 
ttggaccggc 
agaacgaaag 
tgccgactgg 
aaagtctttg 
aagggcacca 
ggcccggaca 
tgcatggccg 
ctctggcatg 
gggacaagtg 
tgtcegggge 
aggegegggt 
catgaacgag 
ctttgtacac 
gccccgccgg 
atctagagga 
aaaegggaga 
egcegggagg 
tgaggtgaga 
cccctccccc 
ctgctcgcgg 
cgtcgttacg 
tctcccggca 
gcgtgtgggg 
ggcctcgcgt 
ggtttaagga 



8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020

10080 

10140 

10200

10260

10320 

10380 

10440 

10500 

10560 

10620 

10680 

10740 

10800 

10860 

10920 

10980 

11040 

11100 

11160 

11220 

11280 

11340 

11400 

11460 

11520 

11580 

11640 

11700 

11760 

11820 

11880 

11940

12000 

12060 
ccccgggggg
cggtcgttcg 
cccgaggcgg 
cccgacccgc 
gggttcccgt 
cacgtgtcfcc 
cctctctctc 
cgtgagttcg 
tgcgtcgatg 
catcgacact 
cgtcggttga 
ctcgcagggc 
gggcggttgt 

cgcgctcgcg 
gcctcgcgtc 
tgggaaccca 
gaggttggcg 
ggttgtcggg 
gttfcgggfccfc 
ggcgccgcgc 
gtatccccgg 
cctcggtggg 
cgfcggctctfc 
ccgcgggacg 
gggagggaga 
ctgtgggctg 
ccctcccgcc 
gccgggtgcc 
tgtcccccct 
attagtcagc 
gaagagccca 
gacccactcc 
tggacggfcgt 
gfctgcfctggg 
cgagaccgat 
tcaagagggc 
gattcaaccc 
ccccgttcct 
gcctccggcg 
gggtcggcgg 
ggcggtgcgc 

gggggggcgg 

ggccgcgctt 
ctctcccccc 
ggcgcgaccg 
cggactgtcc 
gtcacgcgtc 
cgacccgtct 
gaaagccgcc 
cgaggcctct 
aggtggagca 
cgaagccaga 
cgacctgggt 
tttccctcag 
aatgattaga 
agaagcccgg 
ttggtaagca 
gacgctcatc 
gaagtcggaa 
aatggatggc 
ggacgggagc 
aggttaatgt 
tgcgcggaac 
gacaataacc 
atttccgtgt 
agaaacgctg 
cgaactggat 



gtcgccctgc 
ggcggctctc 
cggtcgtgtg 
gccgccggct 
gtcgttcccg 
gtttcgfctcc 
cggggagagg 
ctcacacccg 
aagaacgcag 
tcgaacgcac 
cgatcaatcg 
caacccccca 
cggtgtggcg 
gcttcttccc 
ggcgcctccc 
ccgcgccccc 
gtfcgagggtg 
gtggcggtcg 
tgcgctgggg 
accctccggc 
tggcgttgcg 
cgccttcgcg 
cttcgtctcc 
ccgcggcgtc 
gggcctcgct 
tgcgtcccgg 
ggcctctcgg 
gtctctttcc 
fcfcctgaccgc 
ggaggaaaag 
gcgccgaatc 
ccggcgccgc 
gaggccggta 
aatgcagccc 
agtcaacaag 
gtgaaaccgt 
ggcggcgcgc 
cccgaccccfc 
gcgggcgcgg 
gggaccgccc 
cgcgaccggc 
cgcgtctcag 
tcgccgaatc 
gtccgcctcc 
ctctcccacc 
ccagtgcgcc 
tcccgacgaa 
tgaaacacgg 
gtggcgcaat 
ccagtccgcc 
cgagcgtacg 
ggaaactctg 
ataggggcga 
gatagctggc 
ggtcttgggg 
ctcgctggcg 
gaactggcgc 
agaccccaga 
tccgctaagg 
gctggagcgt 
ggccgcgaat 
catgataata 
ccctatttgt 
ctgataaatg 
cgcccttatt 
gtgaaagtaa 
ctcaacagcg 



cgcccccagg 
cctcagactc 

ggggggtgga 

tgcccgattt 
tgtttttccg 
tgcfcggccgg 
agggcggtgg 
aaataccgat 
ctagctgcga 
ttgcggcccc 
cgtcacccgc 
acccgggtcg 
cgcgcgcccg 
gctccgccgt 
ggaccgctgc 
gtggcgcccg 
tgcgtgcgcc 
acgagggccg 
gaggcggggt 
ttgtgtggag 
agggagggt t 
ccgcacgcgg 
gcttctcctt 
cgtgcgccga 
gacccgttgc 
gggttgcgtg 
ggaccccctg 
cgcccgcctc 
gacctcagat 
aaactaacca 
cccgccgcgc 
tcgtgggggg 
gcggccccgg 
aaagcgggtg 
taccgtaagg 
taagaggtaa 
gtccggccgt 
ccacccgcgc 
ggggtggtgt 
ccggccggcg 
tccgggacgg 
ggcgcgccga 
ccggggccga 
cgggcgggcg 
cccctccgtc 
ccgggcgtcg 
gccgagcgca 
accaaggagt 
gaaggtgaag 
gagggcgcac 
cgttaggacc 
gtggaggtcc 
aagactaatc 
gctctcgctc 
ccgaaacgat 
tggagccggg 
tgcgggatga 
aaaggtgttg 
agtgtgtaac 
cgggcccata 
tcttgaagac 
atggtttctt 
ttafcttttct 
cttcaataat 
cccttttttg 
aagafcgctga 
gtaagatcct 



gtcggggggc 
catgaccctc 
tgtctggagc 
ccgcgggtcg 
ctcccgaccc 
cctgaggcta 
tcgttggggg 
acgactctta 
gaattaatgt 

gggttcctcc 

tgcggtgggt 
ggccctccgt 
cgtcgcggag 
tcccgccctc 
ctcaccagtc 

ggggtgggcg 

gaggtggtgg 
gtcggfccgcc 
cgaccgctcg 
ggagagcgag 
tggcgtcccg 
ccgctagggg 
cacccgggcg 
tgcgagtcac 
gtcccggctt 
tgagtaagat 
agacggttcg 
ctcgctctct 
cagacgtggc 
ggattccctc 
gtcgcggcgt 
cccaagtcct 
cgcgccgggc 
gtaaactcca 
gaaagttgaa 
acgggfcgggg 
gcccggtggt 
gtcgttcccc 
ggtggtggcg 
accggccgcc 
ccgggaaggc 
accacctcac 
ggaagccaga 

tgggggtggg 

gcctctctcg 
tcgcgccgtc 
cggggfccggc 
ctaacgcgtg 
ggccccgccc 
caccggcccg 
cgaaagatgg 
gtagcggtcc 
gaaccatcta 
ccgacgtacg 
ctcaacctat 
cgtggaatgc 
accgaacgcc 
gttgatatag 
aactcacctg 
cccggccgtc 
gaaagggcct 
agacgtcagg 
aaatacattc 
attgaaaaag 
cggcattttg 
agatcagttg 
tgagagtttt 



ggtggggccc 
ctccccccgc 
cccctcgggc 
gtcctgtcgg 
tttttttttc 
cccctcggtc 
actgtgccgt 
gcggtggatc 
gaattgcagg 

cggggctacg 

gctgcgcggc 
ctcccgaagt 
cctggtctcc 
gcccgtgcac 
tttctcggtc 
cgtccgcatc 
tcggtcccct 
tgcggtggtt 
cggggttggc 
ggcgagaacg 
cgtccgtccg 
cggtcggggc 
gtacccgctc 
ccccgggfcgfc 
ccctgggggg 
cctccacccc 
ccggctcgtc 
tcttcccgcg 
gacccgctga 
agtaacggcg 
gggaaatgtg 
tctgatcgag 
tcgggtcttc 
tctaaggcta 
aagaactttg 
tccgcgcagt 
cccggcggat 
tcttcctccc 
cgcgggcggg 
gccgggcgca 
ccggtgggga 
cccgagtgtt 
tacccgtcgc 
ggccgggccg 
gggcccggtg 
gggtcccggg 
ggcgatgfccg 
cgcgagtcag 

gggggcccga 

tctcgcccgc 
tgaactatgc 
tgacgtgcaa 
gtagcfcggtfc 
cagttttatc 
tctcaaactt 
gagtgcctag 
gggttaaggc 
acagcaggac 
ccgaatcaac 
gccgcagtcg 
cgtgatacgc 

tggcactttt 

aaatatgtat 
gaagagfcatg 
cttcctgttt 
ggtgcacgag 
cgccccgaag 



gtagggaagt 
tgccgccgtt 
gccgtggggg 
tgccggtcgt 
ctccccccca 
catctgttct 
cgtcagcacc 
actcggctcg 
acacattgat 
cctgfcctgag 

tgggagtttg 

tcagacgtgt 
cccgcgcatc 
cccggtcctg 
ccgtgccccg 
tgctctggtc 
gcggccgcgg 
gtctgtgtgt 
gcggtcgccc 
gagagaggtg 
tccctccctc 
ccgtggcccc 
cggcgccggc 
tgcgagttcg 
gacccggcgt 
cgccgccctc 
ctcccgtgcc 
gctgggcgcg 
atttaagcat 
agtgaacagg 
gcgtacggaa 
gcccagcccg 
ccggagtcgg 
aataccggca 
aagagagagt 
ccgcccggag 
ctttcccgct 
cgcgtccggc 
gccgggggtg 
cttccaccgt 
aggtggctcg 
acagccctcc 
cgcgctctcc 
cccctcccac 
gggggcgggg 
gggaccgtcg 
gctacccacc 
gggctcgt cc 

ggtgggatcc 

cgcgccgggg 
ttgggcaggg 
atcggtcgtc 
ccctccgaag 
cggtaaagcg 
taaatgggta 
tgggccactt 
gcccgatgcc 
ggtggccatg 
tagccctgaa 
gaacggaacg 
ctatttttat 
cggggaaatg 
ccgctcatga 
agtattcaac 
ttgctcaccc 
tgggttacat 
aacgttttcc 



12120 
12180 
12240 
12300 
12360 
12420 
12480 
12540 
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
13140 
13200 
13260 
13320 
13380 
13440 
13500 
13560 
13620 
13680 
13740 
13800 
13860 
13920 
13980 
14040 
14100 
14160 
14220 
14280 
14340 
14400 
14460 
14520 
14580 
14640 
14700 
14760 
14820 
14880 
14940 
15000 
15060 
15120 
15180 
15240 
15300 
15360 
15420 
15480 
15540 
15600 
15660 
15720 
15780 
15840 
15900 
15960 
16020 
16080 
aatgatgagc
gcaagagcaa 
agtcacagaa 
aaccatgagt 
gctaaccgct 
ggagctgaat 
aacaacgttg 
aatagactgg 
tggctggttt 
agcactgggg 
ggcaactatg 
ttggtaactg 
ttaatttaaa 
acgtgagttt 
agatcctttt 

ggfcggtttgt 

cagagcgcag 
gaactctgta 
cagtggcgat 
gcagcggtcg 
caccgaactg 
aaggcggaca 



acttttaaag 
ctcggtcgcc 
aagcatctta 
gataacactg 
tttttgcaca 
gaagccatac 
cgcaaactat 
atggaggcgg 
attgctgata 
ccagatggta 
gatgaacgaa 
tcagaccaag 
agga t c t agg 
tcgttccact 
tttctgcgcg 
ttgccggatc 
ataccaaata 
gcaccgccta 
aagtcgtgtc 
ggctgaacgg 
agatacctac 
ggtatccggt 



ttctgctatg 
gcatacacta 
cggatggcat 
cggccaactt 
acatggggga 
caaacgacga 
taactggcga 
ataaagttgc 
aatctggagc 
agccctcccg 
atagacagat 
tttactcata 
tgaagatcct 
gagcgtcaga 
taatctgctg 
aagagctacc 
ctgtccttct 
catacctcgc 
ttaccgggtt 
ggggttcgtg 
agcgtgagct 
aagcggcagg 



fcggcgcggta 
ttctcagaat 
gacagtaaga 
acttctgaca 
tcatgtaact 
gcgtgacacc 
actacttact 
aggaccactt 
cggtgagcgt 
tatcgtagtt 
cgctgagata 
tatactttag 
ttttgataat 
ccccgtagaa 
cttgcaaaca 
aactcttttt 
agtgtagccg 
tctgctaatc 
ggactcaaga 
cacacagccc 
atgagaaagc 
gtcggaacag 



ttatcccgtg 
gacttggttg 
gaattatgca 
acgatcggag 
cgccttgatc 
acgatgcctg 
ctagcttccc 
ctgcgctcgg 
gggtctcgcg 
atctacacga 
ggtgcctcac 
attgatttaa 
ctcatgacca 
aagatcaaag 
aaaaaaccac 
ccgaaggtaa 
tagttaggcc 
ctgttaccag 
cgatagttac 
agcttggagc 
gccacgcttc 
gaga 



ttgacgccgg 
agtactcacc 
gtgctgccat 
gaccgaagga 
gttgggaacc 
cagcaatggc 
ggcaacaatt 
cccttccggc 
gtatcattgc 
cggggagtca 
tgattaagca 
aacttcattt 
aaatccctta 
gatctfccttg 
cgctaccagc 
ctggcttcag 
accacttcaa 
tggctgctgc 
cggataaggc 
gaacgaccta 
cgaagggaga 



<210> 119 
<211> 2814 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pLITMUS38 Plasmid 



<400> 119 

gttaactacg 

tttctaaata 

ataatattga 

ttttgcggca 

tgctgaagat 

gatccttgag 

gctatgtggc 

acactattct 

tggcatgaca 

caacttactt 

gggggatcat 

cgacgagcgt 
tggcgaacta 
agttgcagga 
tggagccggt 
ctcccgtatc 
acagatcgct 
ctcatatata 
aagattgtat 
aatttttgtt 
aaatcaaaag 
ctattaaaga 
ccactacgtg 
aatcggaacc 
gaaaggaagg 
cgctgcgcgt 
atctaggtga 
ttccactgag 
ctgcgcgtaa 
ccggatcaag 
ccaaatactg 
ccgcctacat 
tcgtgtctta 
tgaacggggg 
tacctacagc 



tcaggtggca 
cattcaaata 
aaaaggaaga 
ttttgccttc 
cagttgggtg 
agttttcgcc 
gcggtattat 
cagaatgact 
gtaagagaat 
ctgacaacga 
gtaactcgcc 
gacaccacga 
cttactctag 
ccacttctgc 
gagcgtgggt 
gtagttatct 
gagataggtg 
ctttagattg 
aagcaaatat 
aaatcagctc 
aatagcccga 
acgtggactc 
aaccatcacc 
ctaaagggag 
gaagaaagcg 
aaccaccaca 
agatcctttt 
cgtcagaccc 
tctgctgctt 
agctaccaac 
ttcttctagt 
acctcgctct 

gttcgtgcac 
gtgagctatg 



cttttcgggg 
tgtatccgct 
gtatgagtat 
ctgtttttgc 
cacgagtggg 
ccgaagaacg 
cccgtgttga 
tggttgagta 
tatgcagtgc 
tcggaggacc 
ttgatcgttg 
tgcctgtagc 
cttcccggca 
gctcggccct 
ctcgcggtat 
acacgacggg 
cctcactgat 
atttaccccg 
ttaaattgta 
attttttaac 
gatagggttg 
caacgtcaaa 
caaatcaagt 
cccccgattt 
aaaggagcgg 
cccgccgcgc 
tgataatctc 
cgtagaaaag 
gcaaacaaaa 
tctttttccg 
gtagccgtag 
gctaatcctg 
ctcaagacga 
acagcccagc 
agaaagcgcc 



aaatgtgcgc 
catgagacaa 
tcaacatttc 
tcacccagaa 
ttacatcgaa 
ttctccaatg 
cgccgggcaa 
ctcaccagtc 
tgccataacc 
gaaggagcta 
ggaaccggag 
aatggcaaca 
acaattaata 
tccggctggc 
cattgcagca 
gagtcaggca 
taagcattgg 
gttgataatc 
aacgttaata 
caataggccg 
agtgttgttc 
gggcgaaaaa 
tttttggggt 
agagcttgac 
gcgctagggc 
ttaatgcgcc 
atgaccaaaa 
atcaaaggat 
aaaccaccgc 
aaggtaactg 
ttaggccacc 
ttaccagtgg 
tagttaccgg 
ttggagcgaa 
acgcttcccg 



ggaaccccta 
taaccctgat 
cgtgtcgccc 
acgctggtga 
ctggatctca 
atgagcactt 
gagcaactcg 
acagaaaagc 
atgagtgata 
accgcttttt 
ctgaatgaag 
acgttgcgca 
gactggatgg 
tggtttattg 
ctggggccag 
actatggatg 
taactgtcag 
agaaaagccc 
ttttgttaaa 
aaat cggcaa 
cagtttggaa 
ccgtctatca 
cgaggtgccg 
ggggaaagcg 
gctggcaagt 
gctacagggc 
tcccttaacg 
cttcttgaga 
taccagcggt 
gcttcagcag 
acttcaagaa 
ctgctgccag 
ataaggcgca 
cgacctacac 
aagggagaaa 



tttgtttatt 
aaatgcttca 
ttattccctt 
aagtaaaaga 
acagcggtaa 
ttaaagttct 
gtcgccgcat 
atcttacgga 
acactgcggc 
tgcacaacat 
ccataccaaa 
aactattaac 
aggcggataa 
ctgataaatc 
atggtaagcc 
aacgaaatag 
accaagttta 
caaaaacagg 
attcgcgtta 
aatcccttat 
caagagtcca 
gggcgatggc 
taaagcacta 
aacgtggcga 
g t agcggt c a 
gcgtaaaagg 
tgagttttcg 
tccttttttt 
ggtttgtttg 
agcgcagata 
ctctgtagca 
tggcgataag 
gcggtcgggc 
cgaac t gaga 
ggcggacagg 



16140 
16200 
16260 
16320 
16380 
16440 
16500 
16560 
16620 
16680 
16740 
16800 
16860 
16920 
16980 
17040 
17100 
17160 
17220 
17280 
17340 
17384 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 
tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac 2160

gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg 2220 

tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg 2280 

ttcctggcct tttgctggcc ttttgctcac atgtaatgtg agttagctca ctcattaggc 2340

accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata 2400 

acaatttcac acaggaaaca gctatgacca tgattacgcc aagctacgta atacgactca 2460

ctagtggggc ccgtgcaatt gaagccggct ggcgccaagc ttctctgcag gatatctgga 2520 

tccacgaatt cgctagcttc ggccgtgacg cgtctccgga tgtacaggca tgcgtcgacc 2580 

ctctagtcaa ggccttaagt gagtcgtatt acggactggc cgtcgtttta caacgtcgtg 2640

actgggaaaa ccctggcgtt acccaactta atcgccttgc agcacatccc cctttcgcca 2700 
gctggcgtaa tagcgaagag gcccgcaccg atcgcccttc ccaacagttg cgcagcctga 2760 

atggcgaatg gcgcttcgct tggtaataaa gcccgcttcg gcgggctttt tttt 2814

<210> 120 
<211> 2847 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pLIT38attB Plasmid 
<400> 120 

gttaactacg tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt 60 
tttctaaata cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca 120
ataatattga aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt 180 
ttttgcggca ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga 240
tgctgaagat cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa 300
gatccttgag agttttcgcc ccgaagaacg ttctccaatg atgagcactt ttaaagttct 360
gctatgtggc gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat 420 
acactattct cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga 480
tggcatgaca gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc 540
caacttactt ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat 600
gggggatcat gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa 660 
cgacgagcgt gacaccacga tgcctgtagc aatggcaaca acgttgcgca aactattaac 720 
tggcgaacta cttactctag cttcccggca acaattaata gactggatgg aggcggataa 780 
agttgcagga ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc 840
tggagccggt gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc 900 
ctcccgtatc gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag 960 
acagatcgct gagataggtg cctcactgat taagcattgg taactgtcag accaagttta 1020 
ctcatatata ctttagattg atttaccccg gttgataatc agaaaagccc caaaaacagg 1080
aagattgtat aagcaaatat ttaaattgta aacgttaata ttttgttaaa attcgcgtta 1140 
aatttttgtt aaatcagctc attttttaac caataggccg aaatcggcaa aatcccttat 1200
aaatcaaaag aatagcccga gatagggttg agtgttgttc cagtttggaa caagagtcca 1260 
ctattaaaga acgtggactc caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc 1320
ccactacgtg aaccatcacc caaatcaagt tttttggggt cgaggtgccg taaagcacta 1380
aatcggaacc ctaaagggag cccccgattt agagcttgac ggggaaagcg aacgtggcga 1440
gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt gtagcggtca 1500
cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc gcgtaaaagg 1560
atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg 1620
ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt 1680 
ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg 174 0 
ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata 1800 
ccaaatactg ttcttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca 186 0 
ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag 192 0 
tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc 1980 
tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga 204 0 
tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg 210 0 
tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac 2160 
gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg 222 0 
tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg 22 8 0 
ttcctggcct tttgctggcc ttttgctcac atgtaatgtg agttagctca ctcattaggc 2340 
accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata 2400 
acaatttcac acaggaaaca gctatgacca tgattacgcc aagctacgta atacgactca 2460 
ctagtggggc ccgtgcaatt gaagccggct ggcgccaagc ttctctgcag gattgaagcc 2 52 0 
tgctttttta tactaacttg agcgaaatct ggatccacga attcgctagc ttcggccgtg 2580 
acgcgtctcc ggatgtacag gcatgcgtcg accctctagt caaggcctta agtgagtcgt 2 640 
attacggact ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc gttacccaac 2700 
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ttaatcgcct tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca 2760 
ccgatcgccc ttcccaacag ttgcgcagcc tgaatggcga atggcgcttc gcttggtaat 2 82 0 
aaagcccgct tcggcgggct ttttttt 2 847 

<210> 121 

<211> 4223 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pLIT38attBBSRpolyA2 Plasmid 



<400> 121 

accatgaaaa catttaacat ttctcaacaa gatctagaat tagtagaagt agcgacagag 60 
aagattacaa tgctttatga ggataataaa catcatgtgg gagcggcaat tcgtacgaaa 120 
acaggagaaa tcatttcggc agtacatatt gaagcgtata taggacgagt aactgtttgt 180 
gcagaagcca ttgcgattgg tagtgcagtt tcgaatggac aaaaggattt tgacacgatt 240 
gtagctgtta gacaccctta ttctgacgaa gtagatagaa gtattcgagt ggtaagtcct 300 
tgtggtatgt gtagggagtt gatttcagac tatgcaccag attgttttgt gttaatagaa 360 
atgaatggca agttagtcaa aactacgatt gaagaactca ttccactcaa atatacccga 420 
aattaaaagt tttaccatac caagcttggc tgctgcctga ggctggacga cctcgcggag 480 
ttctaccggc agtgcaaatc cgtcggcatc caggaaacca gcagcggcta tccgcgcatc 540 
catgcccccg aactgcagga gtggggaggc acgatggccg ctttggtccg gatctttgtg 600 
aaggaacctt acttctgtgg tgtgacataa ttggacaaac tacctacaga gatttaaagc 660 
tctaaggtaa atataaaatt tttaagtgta taatgtgtta aactactgat tctaattgtt 720 
tgtgtatttt agattccaac ctatggaact gatgaatggg agcagtggtg gaatgccttt 780 
aatgaggaaa acctgttttg ctcagaagaa atgccatcta gtgatgatga ggctactgct 840 
gactctcaac attctactcc tccaaaaaag aagagaaagg tagaagaccc caaggacttt 900 
ccttcagaat tgctaagttt tttgagtcat gctgtgttta gtaatagaac tcttgcttgc 960 
tttgctattt acaccacaaa ggaaaaagct gcactgctat acaagaaaat tatggaaaaa 1020 
tattctgtaa cctttataag taggcataac agttataatc ataacatact gttttttctt 1080 
actccacaca ggcatagagt gtctgctatt aataactatg ctcaaaaatt gtgtaccttt 1140 
agctttttaa tttgtaaagg ggttaataag gaatatttga tgtatagtgc cttgactaga 1200 
gatcataatc agccatacca catttgtaga ggttttactt gctttaaaaa acctcccaca 1260 
cctccccctg aacctgaaac ataaaatgaa tgcaattgtt gttgttaact tgtttattgc 1320 
agcttataat ggttacaaat aaagcaatag catcacaaat ttcacaaata aagatccaga 1380 
tttcgctcaa gttagtataa aaaagcaggc ttcaatcctg cagagaagct tggcgccagc 1440 
cggcttcaat tgcacgggcc ccactagtga gtcgtattac gtagcttggc gtaatcatgg 1500 
tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa catacgagcc 1560 
ggaagcataa agtgtaaagc ctggggtgcc taatgagtga gctaactcac attacatgtg 1620 
agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 1680 
taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 1740 
cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 1800 
tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 1860 
gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 1920 
gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 1980 
tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 2040 
gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 2100 
cggctacact agaagaacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 2160 
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 2220 
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 2280 
ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 2340 
attatcaaaa aggatcttca cctagatcct tttacgcgcc ctgtagcggc gcattaagcg 2400 
cggcgggtgt ggtggttacg cgcagcgtga ccgctacact tgccagcgcc ctagcgcccg 2460 
ctcctttcgc tttcttccct tcctttctcg ccacgttcgc tttccccgtc aagctctaaa 2520 
tcgggggctc cctttagggt tccgatttag tgctttacgg cacctcgacc ccaaaaaact 2580 
tgatttgggt gatggttcac gtagtgggcc atcgccctga tagacggttt ttcgcccttt 2640 
gacgttggag tccacgttct ttaatagtgg actcttgttc caaactggaa caacactcaa 2700 
ccctatctcg ggctattctt ttgatttata agggattttg ccgatttcgg cctattggtt 2760 
aaaaaatgag ctgatttaac aaaaatttaa cgcgaatttt aacaaaatat taacgtttac 2820 
aatttaaata tttgcttata caatcttcct gtttttgggg cttttctgat tatcaaccgg 2880 
ggtaaatcaa tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc 2940 
agtgaggcac ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc 3000 
gtcgtgtaga taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata 3060 
ccgcgagacc cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg 3120 
gccgagcgca gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc 3180 
cgggaagcta gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct 3240 
acaggcatcg tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa 3300 
cgatcaaggc gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt 3360 
cctccgatcg ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca 3420 
ctgcataatt ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac 3480 
tcaaccaagt cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca 3540 
acacgggata ataccgcgcc acatagcaga actttaaaag tgctcatcat tggagaacgt 3600 
tcttcggggc gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc 3660 
actcgtgcac ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca 3720 
aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata 3780 
ctcatactct tcctttttca atattattga agcatttatc agggttattg tctcatgagc 3840 
ggatacatat ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc 3900 
cgaaaagtgc cacctgacgt agttaacaaa aaaaagcccg ccgaagcggg ctttattacc 3960 
aagcgaagcg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg 4020 
cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg 4080 
taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtccg taatacgact 4140 
cacttaaggc cttgactaga gggtcgacgc atgcctgtac atccggagac gcgtcacggc 4200 
cgaagctagc gaattcgtgg atc 4223 



<210> 122 
<211> 2686 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pUC18 Plasmid 



<400> 122 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 
ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240 
attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300 
tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360 
tttcccagtc acgacgttgt aaaacgacgg ccagtgccaa gcttgcatgc ctgcaggtcg 420 
actctagagg atccccgggt accgagctcg aattcgtaat catggtcata gctgtttcct 480 
gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt 540 
aaagcctggg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc 600 
gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg 660 
agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 720 
gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca 780 
gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 840 
cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac 900 
aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg 960 
tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac 1020 
ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat 1080 
ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 1140 
cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac 1200 
ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt 1260 
gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt 1320 
atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc 1380 
aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga 1440 
aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 1500 
gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc 1560 
cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct 1620 
gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca 1680 
tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct 1740 
ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca 1800 
ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc 1860 
atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg 1920 
cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct 1980 
tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa 2040 
aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta 2100 
tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc 2160 
ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg 2220 
agttgctctt gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa 2280 
gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg 2340 
agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc 2400 
accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg 2460 
gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat 2520 
cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 2580 
ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg tctaagaaac cattattatc 2640 
atgacattaa cctataaaaa taggcgtatc acgaggccct ttcgtc 2686 

<210> 123 
<211> 8521 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCXeGFPattB(6xHS4)2 Plasmid 



<400> 123 

tacggggcgg 

gcccatatat 

ccaacgaccc 

ggactttcca 

at caagt gt a 

cctggcatta 

tattagtcat 

atctcccccc 

gcgatggggg 

gggcggggcg 

tccttttatg 

gggagtcgct 

ccggctctga 

gagctgtaat 

ccttaaaggg 
tgtgtgtgtg 
cgggcgcggc 
ggtgccccgc 
tgggggggtg 
cctccccgag 
gcggggctcg 
ccgcctcggg 
gtcgaggcgc 
gacttccttt 
tagcgggcgc 
eg t gcgt cgc 
acggctgcct 
gctctagagc 
acgtgctggt 
agggegagga 
acggccacaa 
ccctgaagtt 
ccctgaccta 
tcttcaagtc 
aeggcaacta 
tcgagctgaa 
acaactacaa 
tgaacttcaa 
agcagaacac 
cccagtccgc 
tcgtgaccgc 
etcaggtgea 
aataccactg 
agcatctgac 
ttttgtgtct 
agtatttggt 
ctataaagag 
aaagccttga 
acatccctaa 
ctcccagtca 
ggegtaatea 
caacatacga 



gggatccact 
ggagttccgc 
ccgcccattg 
ttgacgtcaa 
teatatgeca 
tgcccagtac 
cgctattacc 
cctccccacc 
eggggggggg 
aggeggagag 

gegaggegge 
gcgttgcctt 
ctgaccgcgt 
tagegcttgg 
etcegggagg 
cgtggggagc 
gcggggcttt 
ggtgcggggg 
agcagggggt 
ttgetgagea 
c cgtgccggg 
ccggggaggg 
ggcgagccgc 
gtcccaaatc 
gggcgaagcg 
cgcgccgccg 
teggggggga 
ctctgctaac 
tgttgtgctg 
gctgttcacc 
gttcagcgtg 
catctgcacc 
cggcgtgcag 
cgccatgccc 
caagacccgc 
gggcatcgac 
cagccacaac 
gatccgccac 
ccccatcggc 
cctgagcaaa 
cgccgggatc 
ggctgectat 
agatcttttt 
ttctggctaa 
ctcactcgga 
ttagagtttg 
gtcatcagta 
cttgaggtta 
aattttcctt 
tagctgtccc 
tggtcatagc 
geeggaagea 



agttattaat 
gttacataac 
aegtcaataa 
tgggtggact 
agtacgcccc 
atgaccttat 
atgggtcgag 
cccaattttg 
gggggegege 
gtgeggegge 
ggcggcggcg 
cgccccgtgc 
tactcccaca 
tttaatgacg 
gccctttgtg 
gccgcgtgcg 
gtgcgctccg 
ggctgegagg 

gtgggcgcgg 

cggcccggct 

cggggggtgg 
ctegggggag 
agccattgcc 
tggeggagee 
gtgcggcgcc 
tccccttctc 

eggggcaggg 

catgttcatg 
tctcatcatt 

ggggtggtgc 

teeggegagg 
accggcaagc 
tgcttcagcc 
gaaggctacg 
gccgaggtga 
ttcaaggagg 
gtctatatca 
aacatcgagg 
gacggccccg 
gaccccaacg 
actctcggca 
cagaaggtgg 
ccctctgcca 
taaaggaaat 
aggacatatg 
gcaacatatg 
tatgaaacag 
gatttttttt 
acatgtttta 
tcttctctta 
tgtttcctgt 
taaagtgtaa 



agtaatcaat 
ttacggtaaa 
tgacgtatgt 
atttaeggta 
etattgaegt 
gggactttcc 
gtgagcccca 
tatttattta 
gec aggeggg 
agecaatcag 
gecctataaa 
cccgctccgc 
ggtgagcggg 
getegtttet 
egggggggag 
gcccgcgctg 
cgtgtgcgcg 
ggaacaaagg 
eggteggget 
t egggt gegg 
cggcaggtgg 
gggcgcggcg 
ttttatggta 
gaaatctggg 
ggcaggaagg 
catctccagc 
eggggttegg 
ccttcttctt 
ttggcaaaga 
ccatcctggt 
gegagggega 
tgcccgtgcc 
gctaccccga 
tecaggageg 
agttcgaggg 
aeggcaacat 
tggccgacaa 
aeggcagegt 
tgctgctgcc 
agaagegega 
tggacgagct 
tggctggtgt 
aaaattatgg 
ttattttcat 
ggagggcaaa 
ecatatgetg 
ccccctgctg 
atattttgtt 
etagecagat 
tgaagatccc 
gtgaaattgt 
agcctggggt 



taeggggtea 
tggcccgcct 
tcccatagta 
aactgcccac 
caatgaeggt 
tacttggcag 
cgttctgctt 
ttttttaatt 
gcggggcggg 
agcggcgcgc 
aagcgaagcg 
gccgcctcgc 
egggaegge c 
tttctgtggc 
eggctegggg 
cccggcggct 
aggggagege 
ctgcgtgcgg 
gtaacccccc 
ggctccgtgc 
gggtgccggg 
gccccggagc 
ategtgegag 
aggcgccgcc 
aaatgggcgg 

cteggggctg 

cttctggcgt 
tttcctacag 
attcgccacc 
cgagctggac 
tgccacctac 
ctggcccacc 
ccacatgaag 
caccatcttc 
cgacaccctg 
cctggggcac 
gcagaagaac 
gcagctcgcc 
cgacaaccac 
tcacatggtc 
gtacaagtaa 
ggccaatgcc 
ggacatcatg 
tgcaatagtg 
tcatttaaaa 
getgecatga 
tccattcctt 
ttgtgttatt 
ttttcctcct 
tcgacctgca 
tatccgctca 
gectaatgag 



ttagttcata 
ggctgaccgc 
aegecaatag 
ttggcagtac 
aaatggcccg 
tacatctacg 
cactctcccc 
attttgtgca 
gcgaggggcg 
tccgaaagtt 
cgcggcgggc 
gccgcccgcc 
cttctcctcc 
tgcgtgaaag 
ggtgcgtgcg 
gtgagcgctg 
ggceggggge 
ggtgtgtgcg 
cctgcacccc 

ggggcgtggc 
eggggegggg 
gccggcggct 
agggegcagg 
gcaccccctc 
ggagggcett 
ccgcaggggg 
gtgaccggcg 
ctcctgggca 
atggtgagca 
ggcgacgtaa 
ggcaagctga 
ctcgtgacca 
cagcacgact 
ttcaaggacg 
gtgaaccgea 
aagctggagt 
ggcatcaagg 
gaccactacc 
tacctgagca 
ctgctggagt 
gaattcactc 
ctggctcaca 
aagccccttg 
tgttggaatt 
catcagaatg 
acaaaggtgg 
attccataga 
tttttcttta 
ctcctgacta 
gcccaagctt 
caattccaca 
tgagctaact 
cacattaatt 
gatccgcatc 
ctaactccgc 
gcagaggccg 
ggaggctagt 
agaagcgttc 
cgctgccggc 
ctgctgcccc 
gtccccgtga 
cgagaagcgt 
cacgctgccg 
cgctgctgcc 
ctgtccccgt 
agcgagaagc 
cgcacgctgc 
ctcgctgctg 
ggctgtcccc 
gc agcgagaa 
cccgcacgct 
ggctcgctgc 
ggggctgtcc 
gagcagcgag 
tccccgcacg 
gcggctcgct 
ggggggctgt 

aagagcagcg 
tgtccccgca 
gggcggctcg 

ggggggggct 

tttttatact 
attgcagctt 
tttttttcac 
tggatccgct 
gctcttccgc 
tatcagctca 
agaacatgtg 
cgtttttcca 
ggtggcgaaa 
tgcgctctcc 
gaagcgtggc 
gctccaagct 
gtaactatcg 
ctggtaacag 
ggcctaacta 
ttaccttcgg 
gtggtttttt 
ctttgatctt 
tggtcatgag 
ttaaatcaat 
gtgaggcacc 
tcgtgtagat 
cgcgagaccc 
ccgagcgcag 
gggaagctag 
caggcatcgt 
gatcaaggcg 
ctccgatcgt 
tgcataattc 
caaccaagtc 
tacgggataa 
cttcggggcg 
ctcgtgcacc 
aaacaggaag 
tcatactctt 
gatacatatt 
gaaaagtgcc 
ggatccgctc 



gcgttgcgct 
tcaattagtc 
ccagttccgc 
aggccgcctc 
ggatcccccg 
agaggaaagc 
tcggggatgc 
ctagcggggg 
gcggatccgc 
tcagaggaaa 
gctcggggat 
ccctagcggg 
gagcggatcc 
gttcagagga 
cggctcgggg 
ccccctagcg 
gtgagcggat 
gcgttcagag 
gccggctcgg 
tgccccctag 
ccgtgagcgg 
aagcgttcag 
ctgccggctc 
gctgccccct 
ccccgtgagc 
agaagcgttc 
cgctgccggc 
ctgctgcccc 
gtccccgtga 
aacttgagcg 
ataatggtta 
tgcattctag 
gcattaatga 
ttcctcgctc 
ctcaaaggcg 
agcaaaaggc 
taggctccgc 
cccgacagga 
tgttccgacc 
gctttctcaa 
gggctgtgtg 
tcttgagtcc 
gattagcaga 
cggctacact 
aaaaagagtt 
tgtttgcaag 
ttctacgggg 
attatcaaaa 
ctaaagtata 
tatctcagcg 
aactacgata 
acgctcaccg 
aagtggtcct 
agtaagtagt 
ggtgtcacgc 
agttacatga 
tgtcagaagt 
tcttactgtc 
attctgagaa 
taccgcgcca 
aaaactctca 
caactgatct 
gcaaaatgcc 
cctttttcaa 
tgaatgtatt 
acctggtcga 
acggggacag 



cactgcccgc 
agcaaccata 
ccattctccg 
ggcctctgag 
ccccgtatcc 
gatcccgtgc 

ggggggagcg 

agggacgtaa 
ggccccgtat 
gcgatcccgt 
gcggggggag 
ggagggacgt 
gcggccccgt 
aagcgatccc 
atgcgggggg 

ggggagggac 

ccgcggcccc 
gaaagcga t c 
ggatgcgggg 
cgggggaggg 
atccgcggcc 
aggaaagcga 
ggggatgcgg 
agcgggggag 
ggat ccgcgg 
agaggaaagc 
tcggggatgc 
ctagcggggg 
gcggatccgc 
aaatcaagct 
caaataaagc 
ttgtggtttg 
atcggccaac 
actgactcgc 
gtaatacggt 
cagcaaaagg 
ccccctgacg 
ctataaagat 
ctgccgctta 
tgctcacgct 
cacgaacccc 
aacccggtaa 
gcgaggtatg 
agaaggacag 
ggtagctctt 
cagcagatta 
tctgacgctc 
aggatcttca 
tatgagtaaa 
atctgtctat 
cgggagggct 
gctccagatt 
gcaactttat 
tcgccagtta 
tcgtcgtttg 
tcccccatgt 
aagttggccg 
atgccatccg 
tagtgtatgc 
catagcagaa 
aggat c t t ac 
tcagcatctt 
gcaaaaaagg 
tattattgaa 
tagaaaaata 
cggtatcgat 
ccccccccca 



tttccagtcg 
gtcccgcccc 
ccccatggct 
ctattccaga 
cccaggtgtc 
caccttcccc 
ccggaccgga 
ttacatccct 
cccccaggtg 
gccaccttcc 
cgccggaccg 
aattacatcc 
atcccccagg 
gtgccacctt 
agcgccggac 
gtaattacat 
gtatccccca 
ccgtgccacc 
ggagcgccgg 
acgtaattac 
ccgtatcccc 
tcccgtgcca 
ggggagcgcc 

ggacgtaatt 
ccccgtatcc 
gatcccgtgc 

ggggggagcg 

agggacgtaa 
ggggctgcag 
cctaggcttt 
aatagcatca 
tccaaactca 
gcgcggggag 
tgcgctcggt 
tatccacaga 
ccaggaaccg 
agcatcacaa 
accaggcgtt 
ccggatacct 
gtaggtatct 
ccgttcagcc 
gacacgactt 
taggcggtgc 
tatttggtat 
gatccggcaa 
cgcgcagaaa 
agtggaacga 
cctagatcct 
cttggtctga 
ttcgttcatc 
taccatctgg 
tatcagcaat 
ccgcctccat 
atagtttgcg 
gtatggcttc 
tgtgcaaaaa 
cagtgttatc 
taagatgctt 
ggcgaccgag 
ctttaaaagt 
cgctgttgag 
ttactttcac 
gaataagggc 
gcatttatca 
aacaaatagg 
aagcttgata 
aagcccccag 



ggaaacctgt 
taactccgcc 
gactaatttt 
ag t ag t gagg 
tgcaggctca 
gtgcccgggc 
gcggagcccc 

gggggctttg 

tctgcaggct 
ccgtgcccgg 
gagcggagcc 
ctgggggctt 
tgtctgcagg 
ccccgtgccc 
cggagcggag 
ccctgggggc 
ggtgtctgca 
ttccccgtgc 
accggagcgg 
atccctgggg 
caggtgtctg 
ccttccccgt 
ggaccggagc 
acatccctgg 
cccaggtgtc 
caccttcccc 
ccggaccgga 
ttacatccct 
gaattcgatt 
tgcaaaaagc 
caaatttcac 
tcaatgtatc 
aggcggtttg 
cgttcggctg 
atcaggggat 
taaaaaggcc 
aaatcgacgc 
tccccctgga 
gtccgccttt 
cagttcggtg 
cgaccgctgc 
atcgccactg 
tacagagttc 
ctgcgctctg 
acaaaccacc 
aaaaggatct 
aaactcacgt 
tttaaattaa 
cagttaccaa 
catagttgcc 
ccccagtgct 
aaaccagcca 
ccagtctatt 
caacgttgtt 
attcagctcc 
agcggttagc 
actcatggtt 
ttctgtgact 
ttgctcttgc 
gctcatcatt 
atccagttcg 
cagcgtttct 
gacacggaaa 
gggttattgt 
ggttccgcgc 
tcgaattcct 
ggatgtaatt 



cgtgccagcg 

catcccgccc 

ttttatttat 

aggctttttt 

aagagcagcg 

tgtccccgca 

gggcggct c g . 

ggggggggct 

caaagagcag 
gctgtccccg 
ccgggcggct 
tggggggggg 
c t caaagagc 
gggctgtccc 
ccccgggcgg 
tttggggggg 
ggctcaaaga 
ccgggctgtc 
agccccgggc 
gctttggggg 
caggctcaaa 
gcccgggctg 
ggagccccgg 
gggctttggg 
tgcaggctca 
gtgcccgggc 
gcggagcccc 

gggggctttg 

gaagcctgct 
taacttgttt 
aaataaagca 
ttatcatgtc 
cgtattgggc 
cggcgagcgg 
aacgcaggaa 
gcgttgctgg 
tcaagtcaga 
agctccctcg 
ctcccttcgg 
taggtcgttc 
gccttatccg 
gcagcagcca 
ttgaagtggt 
ctgaagccag 
gctggtagcg 
caagaagatc 
taagggattt 
aaatgaagtt 
tgcttaatca 
tgactccccg 
gcaatgatac 
gccggaaggg 
aattgttgcc 
gccattgcta 
ggttcccaac 
tccttcggtc 
atggcagcac 
ggtgagtact 
ccggcgtcaa 
ggaaaacgtt 
atgtaaccca 

gggtgagcaa 

tgttgaatac 
ctcatgagcg 
acatttcccc 
gcagccccgc 
acgtccctcc 
cccgctaggg 
atccccgagc 
tttcctctga 
gcggatccgc 
cccccgctag 
gcatccccga 
gctttcctct 
ccgcggatcc 
ctcccccgct 
ccgcatcccc 
tcgctttcct 
ggccgcggat 
ccctcccccg 
ccccgcatcc 
gatcgctttc 
ggggccgcgg 
gtccctcccc 
ccccccgcat 
gggatcgctt 
acggggccgc 
acgtccctcc 
ctccccccgc 
acgggatcgc 
a 



ggc agcagcg 
cggcagcgtg 
acgcttctcg 
tcacggggac 
ggggcagcag 
gccggcagcg 
gaacgcttct 
gctcacgggg 
agggggcagc 
gagccggcag 
ctgaacgctt 
ccgctcacgg 
ctagggggca 
ccgagccggc 
ctctgaacgc 
atccgctcac 
cgctaggggg 
ccccgagccg 
tcctctgaac 
ggatccgctc 
cccgctaggg 
atccccgagc 
tttcctctga 



agccgcccgg 
cggggacagc 
ctgctctttg 
agcccccccc 
cgagccgccc 
tgcggggaca 
cgctgctctt 
acagcccccc 
agcgagccgc 
cgtgcgggga 
ctcgctgctc 
ggacagcccc 
gcagcgagcc 
agcgtgcggg 
ttctcgctgc 
ggggacagcc 
cagcagcgag 
gcagcgtgcg 
gcttctcgct 
acggggacag 
ggcagcagcg 
cggcagcgtg 
acgcttctcg 



ggctccgctc 
ccgggcacgg 
agcctgcaga 
caaagccccc 
ggggctccgc 
gcccgggcac 
tgagcctgca 
cccaaagccc 
ccggggctcc 
cagcccgggc 
tttgagcctg 
cccccaaagc 
gcccggggct 
ga c age c egg 
tetttgagee 
cccccccaaa 
ccgcccgggg 
gggacagccc 
gctctttgag 
ccccccccca 
agccgcccgg 

cggggacagc 

ctgctctttg 



cggtccggcg 
ggaaggtggc 
cacctggggg 
agggatgtaa 
tccggtccgg 
ggggaaggtg 
gacacctggg 
ccagggatgt 
gctccggtcc 
aeggggaagg 
cagacacctg 
ccccagggat 
ccgctccggt 
geaeggggaa 
tgcagacacc 
gcccccaggg 
ctccgctccg 
gggcacgggg 
cctgcagaca 
aagcccccag 
ggctccgctc 
ccgggcacgg 
agcctgcaga 



ctccccccgc 
acgggatcgc 
ataeggggee 
ttacgtccct 
cgctcccccc 
geaegggate 
ggatacgggg 
aattaegtec 
ggcgctcccc 
tggcacggga 
ggggataegg 
gtaattacgt 
ccggcgctcc 
ggtggcacgg 
tgggggatac 
atgtaattac 
gtccggcgct 
aaggtggcac 
cctgggggat 
ggatgtaatt 
cggtccggcg 
ggaaggtggc 
cacctggggg 



7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
<400> 124 

cagt tgeegg 

gtcatggccg 

tacagctcgt 

tcctggaccg 

tccacgaagt 

tcgcgcgcgg 

caagttagta 

agccccgcgg 

gtccctcccc 

ccccccgcat 

gggatcgctt 

acggggccgc 

acgtccctcc 

ctccccccgc 

acgggatcgc 

ataeggggee 

ttacgtccct 

cgctcccccc 

geaegggate 

ggatacgggg 

aattaegtec 

ggcgctcccc 

tggcacggga 

ggggataegg 

gtaattacgt 

ccggcgctcc 

ggtggcacgg 
tgggggatac 
atgtaattac 
gtccggcgct 
aaggtggcac 
cctgggggat 
tagttcatag 



c eggg t egcg 
gcccggaggc 
ccaggccgcg 
cgctgatgaa 
cccgggagaa 
tgagcacegg 
taaaaaagca 
atccgctcac 
cgctaggggg 
ccccgagccg 
tcctctgaac 
ggatccgctc 
cccgctaggg 
atccccgagc 
tttcctctga 
gcggatccgc 
cccccgctag 
gcatccccga 
gctttcctct 
ccgcggatcc 
ctcccccgct 
ccgcatcccc 
tcgctttcct 
ggccgcggat 
ccctcccccg 
ccccgcatcc 
gatcgctttc 

ggggccgcgg 

gtccctcccc 
ccccccgcat 
gggatcgctt 
aeggggeggg 
cccatatatg 



cagggegaac 
gtcccggaag 
cacccacacc 
cagggtcacg 
cccgagccgg 
aacggcactg 
ggcttcaatc 
ggggacagcc 
cagcagcgag 
gcagcgtgcg 
gcttctcgct 
acggggacag 
ggcagcagcg 
cggcagcgtg 
acgcttctcg 
tcacggggac 
ggggcagcag 

gccggcagcg 
gaaegcttet 
gctcacgggg 
agggggcagc 
gagccggcag 
ctgaacgctt 
ccgctcacgg 
ctagggggca 
ccgagccggc 
ctctgaacgc 
atccgctcac 
cgctaggggg 
ccccgagccg 
tcctctgaac 
ggatccacta 
gagttccgcg 



tcccgccccc 
ttcgtggaca 
caggecaggg 
tcgtcccgga 
teggtccaga 
gtcaacttgg 
ctgcagagaa 
cccccccaaa 
ccgcccgggg 
gggacagccc 
gctctttgag 
ccccccccca 
agccgcccgg 
cggggacagc 
ctgctctttg 
agcccccccc 
cgagccgccc 
tgcggggaca 
cgctgctctt 
acagcccccc 
agcgagccgc 
cgtgcgggga 
ctcgctgctc 
ggacagcccc 
gcagcgagcc 
agcgtgcggg 
ttctcgctgc 

ggggacagcc 

cagcagcgag 
gcagcgtgcg 
gcttctcgct 
gttattaata 
ttacataact 



acggctgctc 
cgacctccga 
tgttgtccgg 
ccacaccggc 
actcgaccgc 
ccatggatcc 
gcttgatatc 
gcccccaggg 
ctccgctccg 
gggcacgggg 
cctgcagaca 
aagcccccag 
ggctccgctc 
ccgggcacgg 
agcctgcaga 
caaagccccc 
ggggctccgc 
gcccgggcac 
tgagcctgca 
cccaaagccc 
ccggggctcc 
cagcccgggc 
tttgagcctg 
cccccaaagc 
gcccggggct 
gacagcccgg 
tetttgagee 
cccccccaaa 
ccgcccgggg 
gggacagccc 
gctctttgag 
gtaatcaatt 
tacggtaaat 



gccgatctcg 
ccactcggcg 
caccacctgg 
gaagt eg t cc 
tccggcgacg 
agatttcget 
gaattcctgc 
atgtaattac 
gtccggcgct 
aaggtggcac 
cctgggggat 
ggatgtaatt 
cggtccggcg 
ggaaggtggc 
cacctggggg 
agggatgtaa 
tccggtccgg 
ggggaaggtg 
gacacctggg 
ccagggatgt 
gctccggtcc 
aeggggaagg 
cagacacctg 
ccccagggat 
ccgctccggt 
geaeggggaa 
tgcagacacc 
gcccccaggg 
ctccgctccg 
gggcacgggg 
cctgcagaca 
aeggggtcat 
ggcccgcctg 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 



gctgaccgcc 
cgccaatagg 
tggcagtaca 
aatggcccgc 
acatctacgt 
actctcccca 
ttttgtgcag 
cgaggggcgg 
ccgaaagttt 
gcggcgggcg 
ccgcccgccc 
ttctcctccg 
gcgtgaaagc 
gtgcgtgcgt 
tgagcgctgc 
gccgggggcg 
gtgtgtgcgt 
ctgcaccccc 
gggcgtggcg 
ggggcggggc 
ccggcggctg 

gggcgcaggg 

caccccctct 
gagggccttc 
cgcaggggga 
tgaccggcgg 
ccctgctgtc 
acagccgagt 
gctgtgctga 
tctatgcctg 
ccctgctgtc 
gggagcccct 
tgcttcgggc 
ctccactccg 
tcctccgggg 
gtacaagtaa 
ggccaatgcc 
ggacatcatg 
tgcaatagtg 
tcatttaaaa 
gctgccatga 
tccattcctt 
ttgtgttatt 
ttttcctcct 
tcgacctgca 
atcccccagg 
gtgccacctt 
agcgccggac 
gtaattacat 
gtatccccca 
ccgtgccacc 
ggagcgccgg 
acgtaattac 
ccgtatcccc 
tcccgtgcca 
ggggagcgcc 
ggacgtaatt 
ccccgtatcc 
gatcccgtgc 

ggggggagcg 
agggacgtaa 
ggccccgtat 
gcgatcccgt 
gcggggggag 
ggagggacgt 
gcggccccgt 
aagcgatccc 



caacgacccc 
gactttccat 
tcaagtgtat 
ctggcattat 
attagtcatc 
tctccccccc 
cgatgggggc 
ggcggggcga 
ccttttatgg 
ggagtcgctg 
cggctctgac 
ggctgtaatt 
cttaaagggc 
gtgtgtgtgc 
gggcgcggcg 
gtgccccgcg 

gggggggtga 

ctccccgagt 
cggggctcgc 
cgcctcgggc 
tcgaggcgcg 
acttcctttg 
agcgggcgcg 
gtgcgtcgcc 
cggctgcctt 
ctctagaatg 
gctccctctg 
cctggagagg 
acactgcagc 
gaagaggatg 
ggaagctgtc 
gcagctgcat 
tctgggagcc 
aacaatcact 
aaagctgaag 
gaattcactc 
ctggctcaca 
aagccccttg 
tgttggaatt 
catcagaatg 
acaaaggtgg 
attccataga 
tttttcttta 
ctcctgacta 
gcccaagctt 
tgtctgcagg 
ccccgtgccc 
cggagcggag 
ccctgggggc 
ggtgtctgca 
ttccccgtgc 
accggagcgg 
atccctgggg 
caggtgtctg 
ccttccccgt 
ggaccggagc 
acatccctgg 
cccaggtgtc 
caccttcccc 
ccggaccgga 
ttacatccct 
cccccaggtg 
gccaccttcc 
cgccggaccg 
aattacatcc 
atcccccagg 
gtgccacctt 



cgcccattga 
tgacgtcaat 
catatgccaa 
gcccagtaca 
gctattacca 
ctccccaccc 

gggggggggg 

ggcggagagg 
cgaggcggcg 
cgttgccttc 
tgaccgcgtt 
agcgcttggt 
tccgggaggg 

gtggggagcg 
cggggctttg 

gtgcgggggg 
gcagggggtg 
tgctgagcac 
cgtgccgggc 
cggggagggc 
gcgagccgca 
tcccaaatct 
ggcgaagcgg 
gcgccgccgt 
cgggggggac 
ggggtgcacg 
ggcctcccag 
tacctcttgg 
ttgaatgaga 
gaggt cgggc 
ctgcggggcc 
gtggataaag 
cagaaggaag 
gctgacactt 
ctgtacacag 
ctcaggtgca 
aataccactg 
agcatctgac 
ttttgtgtct 
agtatttggt 
ctataaagag 
aaagccttga 
acatccctaa 
ctcccagtca 
gcatgcctgc 
ctcaaagagc 
gggctgtccc 
ccccgggcgg 
tttggggggg 
ggctcaaaga 
ccgggctgtc 
agcc c cgggc 
gctttggggg 
caggctcaaa 
gcccgggctg 
ggagccccgg 
gggctttggg 
tgcaggctca 
gtgcccgggc 
gcggagcccc 

gggggctttg 

tctgcaggct 
ccgtgcccgg 
gagcggagcc 
ctgggggctt 
tgtctgcagg 
ccccgtgccc 



cgtcaataat 

gggtggacta 

gtacgccccc 
tgaccttatg 
tgggtcgagg 
ccaattttgt 
ggggcgcgcg 
tgcggcggca 
gcggcggcgg 
gccccgtgcc 
actcccacag 
ttaatgacgg 
ccctttgtgc 
ccgcgtgcgg 
tgcgctccgc 
gctgcgaggg 
tgggcgcggc 
ggcccggctt 

ggggggtggc 

tcgggggagg 
gccattgcct 
ggcggagccg 
tgcggcgccg 
ccccttctcc 

ggggcagggc 

aatgtcctgc 
tcctgggcgc 
aggccaagga 
atatcactgt 
agcaggccgt 
aggccctgtt 
ccgtcagtgg 
ccatctcccc 
tccgcaaact 
gggaggcctg 
ggctgcctat 
agatcttttt 
ttctggctaa 
ctcactcgga 
ttagagtttg 
gtcatcagta 
cttgaggtta 
aattttcctt 
tagctgtccc 
aggtcgactc 
agcgagaagc 
cgcacgctgc 
ctcgctgctg 
ggctgtcccc 
gcagcgagaa 
cccgcacgct 
ggctcgctgc 
ggggctgtcc 
gagcagcgag 
tccccgcacg 
gcggctcgct 
ggggggctgt 
aagagcagcg 
tgtccccgca 
gggcggctcg 

ggggggggct 

caaagagcag 
gctgtccccg 
ccgggcggct 
tggggggggg 
ctcaaagagc 
gggctgtccc 



gacgtatgtt 
tttacggtaa 
tattgacgtc 
ggactttcct 
tgagccccac 
atttatttat 
ccaggcgggg 
gccaatcaga 
ccctataaaa 
ccgctccgcg 
gtgagcgggc 
ctcgtttctt 

gggggggagc 

cccgcgctgc 
gtgtgcgcga 
gaacaaaggc 
ggt cgggc tg 
cgggtgcggg 
ggcaggtggg 
ggcgcggcgg 
tttatggtaa 
aaatctggga 
gcaggaagga 
atctccagcc 
ggggttcggc 
ctggctgtgg 
cccaccacgc 
ggccgagaat 
cccagacacc 
agaagtctgg 
ggtcaactct 
ccttcgcagc 
tccagatgcg 
cttccgagtc 
caggacaggg 
cagaaggtgg 
ccctctgcca 
taaaggaaat 
aggacatatg 
gcaacatatg 
tatgaaacag 
gatttttttt 
acatgtttta 
tcttctctta 
tagtggatcc 
g t tcagagga 
cggctcgggg 
ccccctagcg 
gtgagcggat 
gcgttcagag 
gccggctcgg 
tgccccctag 
ccgtgagcgg 
aagcgttcag 
ctgccggctc 
gctgccccct 
ccccgtgagc 
agaagcgttc 
cgctgccggc 
ctgctgcccc 
gtccccgtga 
cgagaagcgt 
cacgctgccg 
cgctgctgcc 
ctgtccccgt 
agcgagaagc 
cgcacgctgc 



cccatagtaa 
actgcccact 
aatgacggta 
acttggcagt 
gttctgcttc 
tttttaatta 
cggggcgggg 
gcggcgcgct 
agcgaagcgc 
ccgcctcgcg 
gggacggccc 
ttctgtggct 
ggctcggggg 
ccggcggctg 

ggggagcgcg 

tgcgtgcggg 
taaccccccc 
gctccgtgcg 
ggfcgc cgggc 
ccccggagcg 
tcgtgcgaga 
ggcgccgccg 
aatgggcggg 
tcggggctgc 
ttctggcgtg 
cttctcctgt 
ctcatctcftg 
atcacgacgg 
aaagttaatt 
cagggcctgg 
tcccagccgt 
ctcaccactc 
gcctcagctg 
tactccaatt 
gacagatgac 
tggctggtgt 
aaaattatgg 
ttattttcat 
ggagggcaaa 
ccatatgctg 
ccccctgctg 
atattttgtt 
ctagccagat 
tgaagatccc 
cccgccccgt 
aagcgatccc 
atgcgggggg 
ggggagggac 
ccgcggcccc 
gaaagcgatc 
ggatgcgggg 
cgggggaggg 
atccgcggcc 
aggaaagcga 

ggggatgcgg 

agcgggggag 
ggatccgcgg 
agaggaaagc 
tcggggatgc 
ctagcggggg 
gcggatccgc 
t cagaggaaa 
gctcggggat 
ccctagcggg 
gagcggatcc 
gttcagagga 
cggctcgggg 



2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
atgcgggggg 
ggggagggac 
ccgcggggct 
ccgctcacaa 
taatgagtga 
aacctgtcgt 
attgggcgct 
cgagcggtat 
gcaggaaaga 
ttgctggcgt 
agtcagaggt 
tccctcgtgc 
ccttcgggaa 
gtcgttcgct 
ttatccggta 
gcagccactg 
aagtggtggc 
aagccagtta 
ggtagcggtg 
gaagatcctt 
gggattttgg 
tgaagtttta 
ttaatcagtg 
ctccccgtcg 
atgataccgc 
ggaagggecg 
tgttgccggg 
attgctacag 
tcccaacgat 
ttcggtcctc 
gcagcactgc 
gagtactcaa 
gcgtcaatac 
aaacgttcfct 
taacccactc 
tgagcaaaaa 
tgaatactca 
atgagcggat 
tttccccgaa 
attaccaagc 
tgcgggcctc 
gtfcgggtaac 
acgactcact 
gatgagtfctg 
tgtgatgcta 
gaagaactcc 
gattccgaag 
tcagtcctgc 



agcgccggac 
gtaattacat 
gcaggaatfcc 
ttccacacaa 
gctaactcac 
gccagctgca 
cfctccgcttc 
cagctcactc 
acatgtgagc 
ttttccatag 
ggcgaaaccc 
gctctcctgt 
gcgtggcgct 
ccaagctggg 
actatcgtct 
gtaacaggat 
ctaactacgg 
ccttcggaaa 
gtttttttgt 
tgatcttttc 
tcatgagatt 
aatcaatcta 
aggcacctat 
tgtagataac 
gagacccacg 
agcgcagaag 
aagctagagt 
gcatcgtggt 
caaggcgagt 
cgatcgttgt 
ataattctct 
ccaagtcatt 
gggataat ac 
cggggcgaaa 
gtgcacccaa 
caggaaggca 
tactcttcct 
acatatttga 
aagtgccacc 
gaagcgccat 
ttcgctatta 
gccagggttt 
taaggccttg 
gacaaaccac 
ttgctttatt 
agcatgagafc 
cccaaccttt 
tcctcggcca 



cggagcggag 
ccctgggggc 
gtaatcatgg 
catacgagcc 
attaattgcg 
ttaatgaatc 
ctcgctcact 
aaaggcggta 
aaaaggccag 
gctccgcccc 
gacaggacta 
tccgaccctg 
ttctcatagc 
cfcgtgtgcac 
tgagtccaac 
tagcagagcg 
ctacactaga 
aagagfctggt 
ttgcaagcag 
tacggggtct 
atcaaaaagg 
aagtatatat 
ctcagcgatc 
tacgatacgg 
ctcaccggct 
tggtcctgca 
aagtagttcg 
gtcacgctcg 
tacatgatcc 
cagaagtaag 
tactgtcatg 
ctgagaatag 
cgcgccacat 
actctcaagg 
ctgatcttca 
aaatgccgca 
ttttcaatat 
atgtatttag 
tgacgtagtt 
tcgccattca 
cgccagctgg 
tcccagtcac 
actagagggt 
aactagaatg 
tgfcaaccatt 
ccccgcgctg 
catagaaggc 
cgaagtgcac 



ccccgggcgg 
tttggggggg 
tcatagctgt 
ggaagcataa 
ttgcgctcac 
ggccaacgcg 
gactcgcfcgc 
atacggtfcat 
caaaaggcca 
cctgacgagc 
taaagatacc 
ccgcttaccg 
tcacgctgta 
gaaccccccg 
ccggtaagac 
aggtatgtag 
aggacagtat 
agctcttgat 
cagattacgc 
gacgctcagt 
atcttcacct 
gagtaaactt 
tgtctatttc 
gagggcttac 
ccagatttat 
actttatccg 
ccagttaata 
tcgtttggta 
cccatgttgt 
ttggccgcag 
ccatccgtaa 
tgtatgcggc 
agcagaactt 
atcttaccgc
gcatctttta 
aaaaagggaa 
tattgaagca 
aaaaataaac 
aacaaaaaaa 
ggctgcgcaa 
cgaaaggggg 
gacgttgtaa 
cgacggtata 
cagtgaaaaa 
ataagctgca 
gaggatcatc 
ggcggtggaa 

g 



ctcgctgctg 
ggctgtcccc 
ttcctgtgtg
agtgtaaagc 
tgcccgcttt 

cggggagagg

gctcggtcgt 
ccacagaatc 
ggaaccgtaa 
atcacaaaaa 
aggcgtttcc 
gatacctgtc 
ggtatctcag 
ttcagcccga 
acgacttatc 
gcggtgctac 
ttggtatctg 
ccggcaaaca 
gcagaaaaaa 
ggaacgaaaa 
agatcctttt 
ggtctgacag
gttcatccat 
catctggccc 
cagcaataaa 
cctccatcca 
gtttgcgcaa 
tggcttcatt 
gcaaaaaagc 
tgttatcact
gatgcttttc
gaccgagttg
taaaagtgct 
tgttgagatc
ctttcaccag 
taagggcgac 
tttatcaggg 
aaataggggt 
agcccgccga 
ctgttgggaa 
atgtgctgca 
aacgacggcc 
cagacatgat 
aatgctttat 
ataaacaagt 
cagccggcgt 
tcgaaatctc 



ccccctagcg 
gtgagcggat 
aaattgttat 
ctggggtgcc 
ccagtcggga 
cggtttgcgt
tcggctgcgg 

aggggataac 
aaaggccgcg 
tcgacgctca
ccctggaagc 
cgcctttctc 
ttcggtgtag 
ccgctgcgcc 
gccactggca 
agagttcttg 
cgctctgctg 
aaccaccgct 
aggatctcaa 
ctcacgttaa 
aaattaaaaa 
ttaccaatgc 
agttgcctga 
cagtgctgca 
ccagccagcc 
gtctattaat 
cgttgttgcc
cagctccggt 
ggttagctcc 
catggttatg 
tgtgactggt 
ctcttgcccg 
catcattgga 
cagttcgatg 
cgtttctggg
acggaaatgt 
ttattgtctc 
tccgcgcaca 
agcgggcttt 
gggcgatcgg 
aggcgattaa 
agtccgtaat 
aagatacatt 
ttgtgaaatt 

tggggtgggc 

cccggaaaac
gtagcacgtg 



<210> 125
<211> 10474 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> p18genEPO Plasmid



6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8851 



<400> 125 

cagttgccgg 

gtcatggccg 

tacagctcgt 

tcctggaccg 

tccacgaagt 

tcgcgcgcgg 

caagttagta 

agccccgcgg 

gtccctcccc 



ccgggtcgcg 
gcccggaggc 
ccaggccgcg 
cgctgatgaa 
cccgggagaa
tgagcaccgg 
taaaaaagca 
atccgctcac 
cgctaggggg



cagggcgaac
gtcccggaag 
cacccacacc 
cagggtcacg 
cccgagccgg 
aacggcactg 
ggcttcaatc 
ggggacagcc
cagcagcgag



tcccgccccc 
ttcgtggaca 
caggccaggg
tcgtcccgga 
tcggtccaga
gtcaacttgg 
ctgcagagaa 
cccccccaaa 
ccgcccgggg 



acggctgctc 
cgacctccga 
tgttgtccgg
ccacaccggc 
actcgaccgc 
ccatggatcc 
gcttgatatc 
gcccccaggg 
ctccgctccg 



gccgatctcg 60
ccactcggcg 120
caccacctgg 180
gaagtcgtcc 240
tccggcgacg 300
agatttcgct 360
gaattcctgc 420
atgtaattac 480
gtccggcgct 540
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ccccccgcat 
gggatcgctt
acggggccgc 
acgtccctcc 
ctccccccgc 
acgggatcgc
atacggggcc 
ttacgtccct 
cgctcccccc 
gcacgggatc 
ggatacgggg 
aattacgtcc 
ggcgctcccc 
tggcacggga 
ggggatacgg 
gtaattacgt 
ccggcgctcc 
ggtggcacgg 
tgggggatac 
atgtaattac 
gtccggcgct 
aaggtggcac 
cctgggggat 
tagttcatag 
gctgaccgcc 
cgccaatagg 
tggcagtaca 
aatggcccgc 
acatctacgt 
actctcccca 
ttttgtgcag 
cgaggggcgg 
ccgaaagttt 
gcggcgggcg 
ccgcccgccc 
ttctcctccg 
gcgtgaaagc 
gtgcgtgcgt 
tgagcgctgc 
gccgggggcg 
gtgtgtgcgt 
ctgcaccccc 
gggcgtggcg 
ggggcggggc 
ccggcggctg 
gggcgcaggg 
caccccctct 
gagggccttc 
cgcaggggga 
tgaccggcgg 
tcgccctttc 
ccgggtccct 
tcaaggaccg 
tgccagcggg 
acagtttggg 
ctgataagct 
gtcacaccag 
gcacacggca 
ggggacagga 
gccacccttc 
ctggctgtgg 
cccaccacgc 
ggccgagaat 
gcttcaggga 
agctagacac 
ctaggcaagg 
ggacccttga 



ccccgagccg 
tcctctgaac 
ggatccgctc 
cccgctaggg
atccccgagc 
tttcctctga 
gcggatccgc 
cccccgctag 
gcatccccga 
gctttcctct 
ccgcggatcc 
ctcccccgct 
ccgcatcccc 
tcgctttcct 
ggccgcggat 
ccctcccccg 
ccccgcatcc 
gatcgctttc 
ggggccgcgg 
gtccctcccc 
ccccccgcat 
gggatcgctt 
acggggcggg 
cccatatatg 
caacgacccc 
gactttccat 
tcaagtgtat 
ctggcattat 
attagtcatc 
tctccccccc 
cgatgggggc 
ggcggggcga 
ccttttatgg 
ggagtcgctg 
cggctctgac 
ggctgtaatt 
cttaaagggc 
gtgtgtgtgc 
gggcgcggcg 
gtgccccgcg 

gggggggtga 

ctccccgagt 
cggggctcgc 
cgcctcgggc 
tcgaggcgcg 
acttcctttg 
agcgggcgcg 
gtgcgtcgcc 
cggctgcctt 
ctctagatgc 
tagaatgggg 
gtttgagcgg 
gcgacttgtc 
gacttggggg 
ggttgagggg 
gataacctgg 
gattgaagtt 
gcaggattga 
aggacgagct 
tccctccccg 
cttctcctgt 
ctcatctgtg 
atcacggtga 
actcctccca 
tgccccccta 
agcaaagcca 
ctccccgggc 



gcagcgtgcg 
gcttctcgct 
acggggacag 
ggcagcagcg 
cggcagcgtg 
acgcttctcg 
tcacggggac 

ggggcagcag 

gccggcagcg 
gaacgcttct 
gctcacgggg 
agggggcagc 
gagccggcag 
ctgaacgctt 
ccgctcacgg 
ctagggggca 
ccgagccggc 
ctctgaacgc 
atccgctcac 
cgctaggggg 
ccccgagccg 
tcctctgaac 
ggatccacta 
gagttccgcg 
cgcccattga 
tgacgtcaat 
catatgccaa 
gcccagtaca 
gctattacca 
ctccccaccc 

gggggggggg 
ggcggagagg 
cgaggcggcg 
cgttgccttc 
tgaccgcgtt 
agcgcttggt 
tccgggaggg 

gtggggagcg 
cggggctttg 
gtgcgggggg 
gcagggggtg 
tgctgagcac 
cgtgccgggc 
cggggagggc 
gcgagccgca 
tcccaaatct 
ggcgaagcgg 
gcgccgccgt 
cgggggggac 
atgctcgagc 
gtgcacggtg 
ggatttagcg 
aaggaccccg 
agtccttggg 
aagaaggttt 
gcgctggagc 
tggccggaga 
atgaaggcca 
ggggcagaga 
cctgactctc 
ccctgctgtc 
acagccgagt 
gaccccttcc 
gatccaggaa 
cataagaata 
gcagatccta 
tgtgtgcatt 



gggacagccc 
gctctttgag 
ccccccccca 
agccgcccgg 
cggggacagc 
ctgctctttg 
agcccccccc 
cgagccgccc 
tgcggggaca 
cgctgctctt 
acagcccccc 
agcgagccgc 
cgtgcgggga 
ctcgctgctc 
ggacagcccc 
gcagcgagcc 
agcgtgcggg 
ttctcgctgc 
ggggacagcc 
cagcagcgag 
gcagcgtgcg 
gcttctcgct 
gttattaata 
ttacataact 
cgtcaataat 
gggt ggac t a 
gtacgccccc 
tgaccttatg 
tgggtcgagg 
ccaattttgt 

ggggcgcgcg 

tgcggcggca 
gcggcggcgg 
gccccgtgcc 
actcccacag 
ttaatgacgg 
ccctttgtgc 
ccgcgtgcgg 
tgcgctccgc 
gctgcgaggg 
tgggcgcggc 
ggcccggctt 

ggggggtggc 

tcgggggagg 
gccattgcct 

ggcggagccg 

tgcggcgccg 
ccccttctcc 
ggggcagggc 
ggccgccagt 
agtactcgcg 
ccccggctat 
gaagggggag 
gatggcaaaa 

gggggttctg 

caccacttat 
agtggatgct 
gggaggcagc 
cgtggggatg 
agcctggcta 
gctccctctg 
cctggagagg 
ccagcacatt 
cctggcactt 
agtctggtgg 
cggcctgtgg 
tcagacgggc 



gggcacgggg 
cctgcagaca 
aagcccccag 
ggctccgctc 
ccgggcacgg 
agcctgcaga 
caaagccccc 

ggggctccgc 

gcccgggcac 
tgagcctgca 
cccaaagccc 
ccggggctcc 
cagcccgggc 
tttgagcctg 
cccccaaagc 
gcccggggct 
gacagcccgg 
tctttgagcc 
cccccccaaa 
ccgcccgggg 
gggacagccc 
gctctttgag 
gtaatcaatt 
tacggtaaat 
gacgtatgtt 
tttacggtaa 
tattgacgtc 
ggactttcct 
tgagccccac 
atttatttat 
cc aggcgggg 
gccaatcaga 
ccctataaaa 
ccgctccgcg 
gtgagcgggc 
ctcgtttctt 

gggggggagc 

cccgcgctgc 
gtgtgcgcga 
gaacaaaggc 
ggtcgggctg 
cgggtgcggg 
ggcaggtggg 
ggcgcggcgg 
tttatggtaa 
aaatctggga 
gcaggaagga 
atctccagcc 
ggggttcggc 
gtgatggata 
ggctgggcgc 
tggccaggag 

gggggtgggg 

acctgacctg 
ctgtgccagt 
ctgccagagg 
ggtagctggg 
acctgagtgc 
aaggaagctg 
tctgttctag 
ggcctcccag 
tacctcttgg 
ccacagaact 
ggtttggggt 
ccccaaacca 
gccagggcca 
tgtgctgaac 



aaggtggcac 
cctgggggat 
ggatgtaatt 
cggt ccggcg 
ggaaggtggc 
cacctggggg 
agggatgtaa 
tccggtccgg 
ggggaaggtg 
gacacctggg 
ccagggatgt 
gctccggtcc 
acggggaagg 
cagacacctg 
ccccagggat 
ccgctccggt 
gcacggggaa 
tgcagacacc 
gcccccaggg 
ctccgctccg 
gggcacgggg 
cctgcagaca 
acggggtcat 
ggcccgcctg 
cccatagtaa 
actgcccact 
aatgacggta 
acttggcagt 
gttctgcttc 
tttttaatta 
cggggcgggg 
gcggcgcgct 
agcgaagcgc 
ccgcctcgcg 
gggacggccc 
ttctgtggct 
ggctcggggg 
ccggcggctg 
ggggagcgcg 
tgcgtgcggg 
taaccccccc 
gctccgtgcg 
ggtgccgggc 
ccccggagcg 
tcgtgcgaga 
ggcgccgccg 
aatgggcggg 
tcggggctgc 
ttctggcgtg 
tctgcagaat 
tcccgcccgc 

gtggctgggt 
tgcctccacg 
tgaaggggac 
ggagaggaag 
ggaagcctct 

ggtiggggtgt 
ttgcatggtt 
tccttccaca 
aatgtcctgc 
t cc tgggcgc 
aggccaagga 
cacgctcagg 
ggagttggga 
tacctggaaa 
gagccttcag 
actgcagctt 
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gaatgagaat 
ggtgagttcc 
tttggatgaa 
cctgggcgca 
cttgagccct 
acatttaaaa 
gctgaggcgg 
ccactgcact 
gaaaaataat 
ttcattcatt 
cttggggctg 
cactccctgt 
gaagctgtcc 
cagctgcatg 
ctgggagccc 
aagggtcttg 
gcgacctcct 
ctgctccact 
atttcctccg 
gacgtacaag 
tgtggccaat 
tggggacatc 
cattgcaata 
aaatcattta 
ctggctgcca 
ctgtccattc 
gttttgtgtt 
gatttttcct 
ccctcgacct 
cgtatccccc 
cccgtgccac 
gggagcgccg 
gacgtaatta 
cccgtatccc 
atcccgtgcc 
gggggagcgc 
gggacgtaat 
gccccgtatc 
cgatcccgtg 
cggggggagc 
gagggacgta 
cggccccgta 
agcgatcccg 
tgcgggggga 
gggagggacg 
cgcggccccg 
aaagcgatcc 
gatgcggggg 
gggggaggga 
tccgcggccc 
ggaaagcgat 
gggatgcggg 
gcgggggagg 
gatccgcggg 
tatccgctca 
gcctaatgag 
ggaaacctgt 
cgtattgggc 
cggcgagcgg 
aacgcaggaa 
gcgttgctgg 
tcaagtcaga 
agctccctcg 
ctcccttcgg 
taggtcgttc 
gccttatccg 
gcagcagcca 



atcactgtcc 
tttttttttt 
agggagaatg 
gaggctcacg 
ggag 1 1 1 c ag 
aaattagtca 
gaggatcgct 
ccagcctcag 
gagggctgta 
cattcattca 
ctgaggggca 
aggtcgggca 
tgcggggcca 
tggataaagc 
aggtgag t ag 
ctaaggagta 
gttttctcct 
ccgaacaatc 
gggaaagctg 
taagaattca 
gcccfcggctc 
atgaagcccc 
gtgtgttgga 
aaacatcaga 
tgaacaaagg 
cttattccat 
atttttttct 
cctctcctga 
gcagcccaag 
aggtgtctgc 
cttccccgtg 
gaccggagcg 
catccctggg 
ccaggtgtct 
accttccccg 
egg acc ggag 
tacatccctg 
ccccaggtgt 
ccaccttccc 
gccggaccgg 
attacatccc 
tcccccaggt 
tgccaccttc 
gcgccggacc 
taattacatc 
tatcccccag 
cgtgccacct 
gagegcegga 
cgtaattaca 
cgtatccccc 
cccgtgccac 
gggagcgccg 
gacgtaatta 
getgeaggaa 
caattccaca 
tgagctaact 
cgtgccagct 
gctcttccgc 
tatcagctca 
agaacatgtg 
cgtttttcca 

ggtggcgaaa 
tgcgctctcc 
gaagcgtggc 
gctccaagct 
gtaactatcg 
ctggtaacag 



cagacaccaa 
tttttccttt 
atcgagggaa 
tctataatcc 
accaacctag 
ggtgaagtgg 
tgageccagg 
tgacagagtg 
tggaatacat 
acaagtctta 
ggagggagag 
geaggcegta 
ggccctgttg 
cgtcagtggc 
gageggacac 
caggaactgt 
tggcagaagg 
actgetgaca 
aagctgtaca 
ctcctcaggt 
acaaatacca 
ttgagcatct 
attttttgtg 
atgagtattt 
tggctataaa 
agaaaagect 
ttaacatccc 
ctactcccag 
cttgcatgcc 
aggctcaaag 
cccgggctgt 
gagccccggg 
ggctttgggg 
gcaggctcaa 
tgcccgggct 
cggagccccg 
ggggctttgg 
ctgcaggctc 
cgtgcccggg 
agcggagccc 

tgggggcttt 

gtctgeagge 
cccgtgcccg 
ggageggage 
cctgggggct 
gtgtctgcag 
tccccgtgcc 
ccggagcgga 
tccctggggg 
aggtgtctgc 
cttccccgtg 
gaccggagcg 
catccctggg 
ttegtaatea 
caacatacga 
cacattaatt 
gcattaatga 
ttcctcgctc 
etcaaaggeg 
agcaaaaggc 
taggctccgc 
cccgacagga 
tgttccgacc 
gctttctcat 
gggctgtgtg 
tcttgagtcc 
gattagcaga 



agttaatttc 
cttttggaga 
aggtaaaatg 
caggctgaga 
gcagcatagt 
tgcatggtgg 
aatttgaggc 
aggccctgtc 
tcattattca 
ttgeataect 
ggtgacatgg 
gaagtctggc 
gtcaactctt 
cttcgcagcc 
ttctgcttgc 
ccgtattcct 
aagccatctc 
ctttccgcaa 
c aggggaggc 
gcaggctgcc 
ctgagatctt 
gacttctggc 
tctctcactc 
ggtttagagt 
gaggtcatca 
tgacttgagg 
taaaattttc 
tcatagctgt 
tgeaggtega 
ageagegaga 
ccccgcacgc 
cggctcgctg 
gggggc t g t c 
agagcagega 
gtccccgcac 
ggcggctcgc 

gggggggctg 

aaagagcagc 
ctgtccccgc 
cgggcggctc 

gggggggggc 

tcaaagagca 
ggctgtcccc 
cccgggcggc 
ttgggggggg 
gctcaaagag 
cgggctgtcc 
gccccgggcg 
ctttgggggg 
aggctcaaag 
cccgggctgt 
gagccccggg 
ggctttgggg 
tggtcatagc 
geeggaagea 
gcgttgcgct 
atcggccaac 
actgactcgc 
gtaatacggt 
cagcaaaagg 
ccccctgacg 
ctataaagat 
ctgccgctta 
agctcacgct 
cacgaacccc 
aacceggtaa 
gcgaggtatg 



tatgcctgga 
atctcatttg 
gagcagcaga 
tggecgagat 
gagatccccc 
t agt cccaga 
tgcagtgagc 
tcaaaaaaga 
ttcactcact 
tctgtttgct 
gtcagctgac 
agggectgge 
cccagccgtg 
tcaccactct 
cctttctgta 
tccctttctg 
ccctccagat 
actcttccga 
ctgeaggaca 
tatcagaagg 
tttccctctg 
taataaagga 
ggaaggac at 
ttggcaacat 
gtatatgaaa 
ttagattttt 
cttacatgtt 
ccctcttctc 
ctctagtgga 
agegttcaga 
tgccggctcg 
ctgcccccta 
cccgtgagcg 
gaagcgttca 
gctgccggct 
tgctgccccc 
tccccgtgag 
gagaagegt t 
acgctgccgg 
gctgctgccc 
tgtccccgtg 
gcgagaagcg 
gcacgctgcc 
tcgctgctgc 
gctgtccccg 
c agegagaag 
ccgcacgctg 
gctcgctgct 
gggctgtccc 
ageagegaga 
ccccgcacgc 
cggctcgctg 

gggggctgtc 

tgtttcctgt 
taaagtgtaa 
cactgcccgc 
gegeggggag 
tgcgctcggt 
tatccacaga 
ccaggaaccg 
agcatcacaa 
accaggegtt 
ccggatacct 
gtaggtatct 
ccgttcagcc 
gacacgactt 
taggcggtgc 



agaggatgga 
egagectgat 
gatgaggctg 
gggagaattg 
atctctacaa 
tatttggaag 
tgtgatcaca 
aaagaaaaaa 
cactcactca 
cagcttggtg 
tcccagagtc 
cctgctgtcg 
ggagcccctg 
getteggget 
agaaggggag 
tggcactgea 
gcggcctcag 
gtctactcca 
ggggacagat 
tggtggctgg 
ccaaaaatta 
aatttatttt 
atgggagggc 
atgccatatg 
cagccccctg 
tttatatttt 
ttactageca 
ttatgaagat 
tcccccgccc 
ggaaagcgat 
gggatgcggg 
gegggggagg 
gatccgcggc 
gaggaaagcg 
eggggatgeg 
tageggggga 
cggatccgcg 
cagaggaaag 
cteggggatg 
ectagegggg 
ageggatccg 
ttcagaggaa 
ggc t egggga 
cccctagcgg 
tgageggate 
cgttcagagg 
ccggctcggg 
gccccctagc 
cgtgagcgga 
agegttcaga 
tgccggctcg 
ctgcccccta 
cccgtgagcg 
gtgaaattgt 
agcctggggt 
tttccagtcg 
aggcggtttg 
cgttcggctg 
atcaggggat 
taaaaaggee 
aaatcgaege 
tccccctgga 
gtccgccttt 
cagttcggtg 
cgaccgctgc 
atcgccactg 
tacagagttc 
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ttgaagtggt 
ctgaagccag 
gctggtagcg 
caagaagatc 
taagggattt 
aaatgaagtt 
tgcttaatca 
tgactccccg 
gcaatgatac 
gccggaaggg 
aattgttgcc 
gccattgcta 
ggttcccaac 
tccttcggtc 
atggcagcac 
ggt gag tact 
ccggcgtcaa 
ggaaaacgtt 
atgtaaccca 
gggtgagcaa 
tgttgaatac 
ctcatgagcg 
acatttcccc 
tttattacca 
cggtgcgggc 
taagttgggt 
aatacgactc 
attgatgagt 
atttgtgatg 
ggcgaagaac 
aacgattccg 
gtgtcagtcc 



ggcctaacta 
ttaccttcgg 
gtggtttttt 
ctttgatctt 
tggtcatgag 
ttaaatcaat 
gtgaggcacc 
tcgtgtagat 
cgcgagaccc 
ccgagcgcag 
gggaagctag 
caggca t cgt 
ga t c aaggcg 
ctccgatcgt 
tgcataattc 
caaccaagtc 
tacgggataa 
cttcggggcg 
ctcgtgcacc 
aaacaggaag 
tcatactctt 
gatacatatt 
gaaaagtgcc 
agcgaagcgc 
ctcttcgcta 
aacgccaggg 
acttaaggcc 
ttggacaaac 
ctattgcttt 
tccagcatga 
aagcccaacc 
tgctcctcgg 



cggctacacfc 
aaaaagag 1 1 
tgtttgcaag 
ttctacgggg 
attatcaaaa 
ctaaagtata 
tatctcagcg 
aactacgata 
acgctcaccg 
aagtggtcct 
agtaagtagt 
ggtgtcacgc 
agttacatga 
t gt cagaagt 
tcttactgtc 
attctgagaa 
taccgcgcca 
aaaactctca 
caactgatct 
gcaaaatgcc 
cctttttcaa 
t gaatgtat t 
acctgacgta 
cattcgccat 
ttacgccagc 
ttttcccagt 
ttgactagag 
cacaactaga 
atttgtaacc 
gatccccgcg 
tttcatagaa 
ccacgaagtg 



agaaggacag 
ggtagctctt 
cagcagatta 
tctgacgctc 
aggatcttca 
tatgagtaaa 
atctgtctat 
cgggagggct 
gctccagatt 
gcaactttat 
tcgccagtta 
tcgtcgtttg 
tcccccatgt 
aagttggccg 
atgccatccg 
tagtgtatgc 
catagcagaa 
aggatcttac 
tcagcatctt 
gcaaaaaagg 
tattattgaa 
tagaaaaata 
gttaacaaaa 
tcaggcfcgcg 
tggcgaaagg 
cacgacgttg 
ggtcgacggt 
atgcagtgaa 
attataagct 
ctggaggatc 
ggcggcggtg 
cacg 



tatttggtat 
gatccggcaa 
cgcgcagaaa 
agtggaacga 
cctagatcct 
cttggtctga 
ttcgttcatc 
taccatctgg 
tatcagcaat 
ccgcctccat 
atagtttgcg 
gtatggcttc 
tgtgcaaaaa 
cagtgttatc 
taagatgctt 
ggcgaccgag 
ctttaaaagt 
cgctgttgag 
ttactttcac 
gaataagggc 
gcatttatca 
aacaaatagg 
aaaagcccgc 
caactgttgg 
gggatgtgct 
taaaacgacg 
atacagacat 
aaaaatgctt 
gcaataaaca 
atccagccgg 
gaatcgaaat 



ctgcgctctg 
acaaaccacc 
aaaaggatct 
aaactcacgt 
tttaaattaa 
cagttaccaa 
catagttgcc 
ccccagtgct 
aaaccagcca 
ccagtctatt 
caacgttgtt 
at t cage tec 
ageggttage 
actcatggtt 
ttctgtgact 
ttgctcttgc 
gctcatcatt 
atccagttcg 
cagegtttet 
gaeaeggaaa 
gggttattgt 
ggttccgcgc 
egaageggge 
gaagggegat 
geaaggegat 
gccagtccgt 
gataagatac 
tatttgtgaa 
agttggggtg 
cgtcccggaa 
ctcgtagcac 



<210> 126 
<211> 6119 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> p18attB2eoeGFP Plasmid



<400> 126 

cagttgccgg

gtcatggccg 

tacagctcgt 

tcctggaccg 

tccacgaagt 

tcgcgcgcgg 

caagttagta 

gatcttcata 

ctggctagta 

caaaatataa 

gcagggggct 

gcatatggca 

tgccctccca 

gaaaataaat 

ataatttttg 

accagccacc 

ctcgtccatg 

gcgcttctcg 

cagcagcacg 

gctgccgtcc 

gtcggccatg 

gttgccgtcc 

ctcgaacttc 

ctcctggacg 

ggggtagcgg



ccgggtcgcg 
gcccggaggc 
ccaggccgcg 
cgctgatgaa 
cccgggagaa 
tgagcaccgg
taaaaaagca 
agagaagagg
aaacatgtaa 
aaaaaatcta 
gtttcatata 
tatgttgcca 
tatgtccttc 
ttcctttatt 
gcagagggaa 
accttctgat 
ccgagagtga 

ttggggtctt 

gggccgtcgc 
tcgatgttgt 
atatagacgt 
tccttgaagt 
acctcggcgc 
tagecttegg 
ctgaagcact 



cagggcgaac
gtcccggaag 
cacccacacc 
cagggtcacg 
cccgagccgg 
aacggcactg 
ggcttcaatc 
gacagctatg 
ggaaaatttt 
acctcaagtc 
ctgatgacct 
aactctaaac 
cgagtgagag 
agecagaagt 
aaagatctca 
aggcagectg 
tcccggcggc 
tgetcaggge 
cgatgggggt
ggeggatett 
tgtggctgtt 
cgatgccctt 
gggtcttgta 
gcatggcgga 
gcacgccgta 



tcccgccccc 
ttcgtggaca
caggccaggg
tcgtcccgga
tcggtccaga
gtcaacttgg 
ctgcagagaa 
actgggagta 
agggatgtta 
aaggcttttc 
etttatagee 
caaatactca 
acacaaaaaa 
cagatgetea 
gtggtatttg 
cacctgagga 
ggtcacgaac 
ggactgggtg 
gttctgctgg 
gaagttcacc 
gtagttgtac 
cagctcgatg 
gttgeegteg 
cttgaagaag 
ggtcagggtg 



acggctgctc 
cgacctccga 
tgttgtccgg 
ccacaccggc 
actcgaccgc 
ccatggatcc 
gcttgggctg 
gtcaggagag 
aagaaaaaaa 
tatggaataa 
acctttgttc 
ttctgatgtt 
ttccaacaca 
aggggcttca 
tgagccaggg 
gtgaattctt 
tccagcagga 
ctcaggtagt 
tagtggtcgg 
ttgatgccgt 
tccagcttgt 
eggtt caeca 
tccttgaaga 
tegtgetget 
gtcacgaggg 



gccgatctcg 
ccactcggcg 
caccacctgg 
gaagtcgtcc
tccggcgacg
agatttcgct
caggtcgagg 
gaggaaaaat 
taacacaaaa 
ggaatggaca 
atggcagcca 
ttaaatgatt 
etattgeaat 
tgatgtcccc 
cattggccac 
acttgtacag 
ccatgtgatc 
ggttgtcggg 
cgagctgcac 
tettctgett 
gccccaggat 
gggtgtcgcc 
agatggtgcg 
tcatgtggtc 
tgggccaggg 
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cacgggcagc tfcgccggtgg tgcagatgaa 
gccctcgccc tcgccggaca cgctgaactt 
caggatgggc accaccccgg tgaacagctc 
tttgccaaaa tgatgagaca gcacaacaac 
gaagaaggca tgaacatggt tagcagaggc 
gaaccccgcc ctgccccgtc ccccccgaag
tggagatgga gaaggggacg gcggcgcggc 
ttcctgccgg cgccgcaccg cttcgcccgc 
cagatttcgg ctccgccaga tttgggacaa 
ccataaaagg caatggctgc ggctcgccgc 
ccgcgcccct cccccgagcc ctccccggcc 
acctgccgcc accccccgcc cggcacggcg 
gcacccgaag ccgggccgtg ctcagcaact 
cccgaccgcc gcgcccacac cccctgctca 
tttgttcccc tcgcagcccc cccgcaccgc 
cgcacacgcg gagcgcacaa agccccgcgc 
gcgcgggccg cacgcggcgc tccccacgca 
cccccccgca caaagggccc tcccggagcc 
aaacgagccg tcattaaacc aagcgctaat 
cgctcacctg tgggagtaac gcggtcagtc 
ggagcggggc acggggcgaa ggcaacgcag 
tatagggccg ccgccgccgc cgcctcgcca 
gattggctgc cgccgcacct ctccgcctcg 
cgcctggcgc gcgccccccc cccccccgcc 
aataaataca aaattggggg tggggagggg 
gggctcacct cgacccatgg taatagcgat 
aaagtcccat aaggtcatgt actgggcata 
gtcaataggg ggcgtacttg gcatatgata 
ccgtaaatag tccacccatt gacgtcaatg 
atacgtcatt attgacgtca atgggcgggg 
taccgtaagt tatgtaacgc ggaactccat
tgattactat taataactag aggatccccg
atagctgttt cctgtgtgaa attgttatcc 
aagcataaag tgtaaagcct ggggtgccta 
gcgctcactg cccgctttcc agtcgggaaa 
ccaacgcgcg gggagaggcg gtttgcgtat 
ctcgctgcgc tcggtcgttc ggctgcggcg 
acggttatcc acagaatcag gggataacgc 
aaaggccagg aaccgcaaaa aggccgcgtt 
tgacgagcat cacaaaaatc gacgctcaag 
aagataccag gcgtttcccc ctggaagctc 
gcttaccgga tacctgtccg cctttctccc 
acgctgtagg tatctcagtt cggtgtaggt
accccccgtt cagcccgacc gctgcgcctt 
ggtaagacac gacttatcgc cactggcagc 
gtatgtaggc ggtgctacag agttcttgaa 
gacagtattt ggtatctgcg ctctgctgaa 
ctcttgatcc ggcaaacaaa ccaccgctgg 
gattacgcgc agaaaaaaag gatctcaaga 
cgctcagtgg aacgaaaact cacgttaagg 
cttcacctag atccttttaa attaaaaatg 
gtaaacttgg tctgacagtt accaatgctt 
tctatttcgt tcatccatag ttgcctgact
gggcttacca tctggcccca gtgctgcaat 
agatttatca gcaataaacc agccagccgg 
tttatccgcc tccatccagt ctattaattg
agttaatagt ttgcgcaacg ttgttgccat 
gtttggtatg gcttcattca gctccggttc 
catgttgtgc aaaaaagcgg ttagctcctt 
ggccgcagtg ttatcactca tggttatggc 
atccgtaaga tgcttttctg tgactggtga 
tatgcggcga ccgagttgct cttgcccggc 
cagaacttta aaagtgctca tcattggaaa 
cttaccgctg ttgagatcca gttcgatgta 
atcttttact ttcaccagcg tttctgggtg
aaagggaata agggcgacac ggaaatgttg 
ttgaagcatt tatcagggtt attgtctcat 






cttcagggtc agcttgccgt aggtggcatc 1560
gtggccgttt acgtcgccgt ccagctcgac 1620
ctcgcccttg ctcaccatgg tggcgaattc 1680
cagcacgttg cccaggagct gtaggaaaaa 1740
tctagagccg ccggtcacac gccagaagcc 1800
gcagccgtcc ccctgcggca gccccgaggc 1860
gacgcacgaa ggccctcccc gcccatttcc 1920
gcccgctaga gggggtgcgg cggcgcctcc 1980
aggaagtccc tgcgccctct cgcacgatta 2040
gcctcgacag ccgccggcgc tccggggccg 2100
cgaggcggcc ccgccccgcc cggcaccccc 2160
agccccgcgc cacgccccgc acggagcccc 2220
cggggagggg ggtgcagggg ggggttacag 2280
cccccccacg cacacacccc gcacgcagcc 2340
ggggcaccgc ccccggccgc gctcccctcg 2400
cgcgcccgca gcgctcacag ccgccgggca 2460
cacacacacg cacgcacccc ccgagccgct 2520
ctttaaggct ttcacgcagc cacagaaaag 2580
tacagcccgg aggagaaggg ccgtcccgcc 2640
agagccgggg cgggcggcgc gaggcggcgc 2700
cgactcccgc ccgccgcgcg cttcgctttt 2760
taaaaggaaa ctttcggagc gcgccgctct 2820
ccccgccccg cccctcgccc cgccccgccc 2880
cccatcgccg cacaaaataa ttaaaaaata 2940
ggggagatgg ggagagtgaa gcagaacgtg 3000
gactaatacg tagatgtact gccaagtagg 3060
atgccaggcg ggccatttac cgtcattgac 3120
cacttgatgt actgccaagt gggcagttta 3180
gaaagtccct attggcgtta ctatgggaac 3240
gtcgttgggc ggtcagccag gcgggccatt 3300
atatgggcta tgaactaatg accccgtaat 3360
ggtaccgagc tcgaattcgt aatcatggtc 3420
gctcacaatt ccacacaaca tacgagccgg 3480
atgagtgagc taactcacat taattgcgtt 3540
cctgtcgtgc cagctgcatt aatgaatcgg 3600
tgggcgctct tccgcttcct cgctcactga 3660
agcggtatca gctcactcaa aggcggtaat 3720
aggaaagaac atgtgagcaa aaggccagca 3780
gctggcgttt ttccataggc tccgcccccc 3840
tcagaggtgg cgaaacccga caggactata 3900
cctcgtgcgc tctcctgttc cgaccctgcc 3960
ttcgggaagc gtggcgcttt ctcatagctc 4020
cgttcgctcc aagctgggct gtgtgcacga 4080
atccggtaac tatcgtcttg agtccaaccc 4140
agccactggt aacaggatta gcagagcgag 4200
gtggtggcct aactacggct acactagaag 4260
gccagttacc ttcggaaaaa gagttggtag 4320
tagcggtggt ttttttgttt gcaagcagca 4380
agatcctttg atcttttcta cggggcctga 4440
gattttggtc atgagattat caaaaaggat 4500
aagttttaaa tcaatctaaa gtatatatga 4560
aatcagtgag gcacctatct cagcgatctg 4620
ccccgtcgtg tagataacta cgatacggga 4680
gataccgcga gacccacgct caccggctcc 4740
aagggccgag cgcagaagtg gtcctgcaac 4800
ttgccgggaa gctagagtaa gtagttcgcc 4860
tgctacaggc atcgtggtgt cacgctcgtc 4920
ccaacgatca aggcgagtta catgatcccc 4980
cggtcctccg atcgttgtca gaagtaagtt 5040
agcactgcat aattctctta ctgtcatgcc 5100
gtactcaacc aagtcattct gagaatagtg 5160
gtcaatacgg gataataccg cgccacatag 5220
acgttcttcg gggcgaaaac tctcaaggat 5280
acccactcgt gcacccaact gatcttcagc 5340
agcaaaaaca ggaaggcaaa atgccgcaaa 5400
aatactcata ctcttccttt ttcaatatta 5460
gagcggatac atatttgaat gtatttagaa 5520



WO 02/097059 



PCT/US02/17452 






aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg acgtagttaa 5580
caaaaaaaag cccgccgaag cgggctttat taccaagcga agcgccattc gccattcagg 5640
ctgcgcaact gttgggaagg gcgatcggtg cgggcctctt cgctattacg ccagctggcg 5700
aaagggggat gtgctgcaag gcgattaagt tgggtaacgc cagggttttc ccagtcacga 5760
cgttgtaaaa cgacggccag tccgtaatac gactcactta aggccttgac tagagggtcg 5820
acggtataca gacatgataa gatacattga tgagtttgga caaaccacaa ctagaatgca 5880
gtgaaaaaaa tgctttattt gtgaaatttg tgatgctatt gctttatttg taaccattat 5940
aagctgcaat aaacaagttg gggtgggcga agaactccag catgagatcc ccgcgctgga 6000
ggatcatcca gccggcgtcc cggaaaacga ttccgaagcc caacctttca tagaaggcgg 6060
cggtggaatc gaaatctcgt agcacgtgtc agtcctgctc ctcggccacg aagtgcacg 6119

<210> 127 
<211> 5855 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCXLamInt Plasmid (Wildtype Integrase)
<400> 127 

gtcgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 60
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180
ggactttcca ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac 240
atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300
cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360
tattagtcat cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc 420
atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480
gcgatggggg cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540
gggcggggcg aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600
tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660
gggagtcgct gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720
ccggctctga ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780
gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840
ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900
tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960
cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020
ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140
cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200
gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260
ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440
tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560
acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620
gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680
acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcatggga agaaggcgaa 1740
gtcatgagcg ccgggattta ccccctaacc tttatataag aaacaatgga tattactgct 1800
acagggaccc aaggacgggt aaagagtttg gattaggcag agacaggcga atcgcaatca 1860
ctgaagctat acaggccaac attgagttat tttcaggaca caaacacaag cctctgacag 1920
cgagaatcaa cagtgataat tccgttacgt tacattcatg gcttgatcgc tacgaaaaaa 1980
tcctggccag cagaggaatc aagcagaaga cactcataaa ttacatgagc aaaattaaag 2040
caataaggag gggtctgcct gatgctccac ttgaagacat caccacaaaa gaaattgcgg 2100
caatgctcaa tggatacata gacgagggca aggcggcgtc agccaagtta atcagatcaa 2160
cactgagcga tgcattccga gaggcaatag ctgaaggcca tataacaaca aaccatgtcg 2220
ctgccactcg cgcagcaaaa tcagaggtaa ggagatcaag acttacggct gacgaatacc 2280
tgaaaattta tcaagcagca gaatcatcac catgttggct cagacttgca atggaactgg 2340
ctgttgttac cgggcaacga gttggtgatt tatgcgaaat gaagtggtct gatatcgtag 2400
atggatatct ttatgtcgag caaagcaaaa caggcgtaaa aattgccatc ccaacagcat 2460
tgcatattga tgctctcgga atatcaatga aggaaacact tgataaatgc aaagagattc 2520
ttggcggaga aaccataatt gcatctactc gtcgcgaacc gctttcatcc ggcacagtat 2580
caaggtattt tatgcgcgca cgaaaagcat caggtctttc cttcgaaggg gatccgccta 2640
cctttcacga gttgcgcagt ttgtctgcaa gactctatga gaagcagata agcgataagt 2700
ttgctcaaca tcttctcggg cataagtcgg acaccatggc atcacagtat cgtgatgaca 2760
gaggcaggga gtgggacaaa attgaaatca aataagaatt cactcctcag gtgcaggctg 2820
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cctatcagaa 
tttttccctc 
gctaataaag 
tcggaaggac 
gtttggcaac 
cagtatatga 
ggttagattt 
tccttacatg 
gtccctcttc 
atagctgttt 
aagcataaag 
gcgctcactg 
tagtcagcaa 
tccgcccatt 
gcctcggcct 
tgcaaaaagc 
caaatttcac 
tcaatgtatc 
aggcggtttg 
cgttcggctg 
at cagggga t 
taaaaaggcc 
aaatcgacgc 
tccccctgga 
gtccgccttt 
cagttcggtg 
cgaccgctgc 
atcgccactg 
tacagagttc 
ctgcgctctg 
acaaaccacc 
aaaaggatct 
aaactcacgt 
tttaaattaa 
cagttaccaa 
catagttgcc 
ccccagtgct 
aaaccagcca 
ccagtctatt 
caacgttgtt 
attcagctcc 
agcggttagc 
actcatggtt 
ttctgtgact 
ttgctcttgc 
gctcatcatt 
atccagttcg 
cagcgtttct 
gacacggaaa 
gggttattgt 
ggttccgcgc 



ggtggtggct 
tgccaaaaat 
gaaatttatt 
atatgggagg 
atatgccata 
aacagccccc 
tttttatatt 
ttttactagc 
tcttatgaag 
cctgtgtgaa 
tgtaaagcct 
cccgctttcc 
ccatagtccc 
ctccgcccca 
ctgagctatt 
taacttgttt 
aaataaagca 
ttatcatgtc 
cgtattgggc 
cggcgagcgg 
aacgcaggaa 
gcgttgctgg 
tcaagt caga 
agctccctcg 
ctcccttcgg 
taggtcgttc 
gccttatccg 
gcagcagcca 
ttgaagtggt 
ctgaagccag 
gctggtagcg 
caagaagatc 
taagggattt 
aaatgaagtt 
tgcttaatca 
tgactccccg 
gcaatgatac 
gccggaaggg 
aattgttgcc 
gccattgcta 
ggttcccaac 
tccttcggtc 
atggcagcac 
ggtgagtact 
ccggcgtcaa 
ggaaaacgtt 
atgtaaccca 
gggtgagcaa 
tgttgaatac 
ctcatgagcg 
acatttcccc 



ggtgtggcca 
tatggggaca 
ttcattgcaa 
gcaaatcatt 
tgctggctgc 
tgctgtccat 
ttgttttgtg 
cagatttttc 
atccctcgac 
attgttatcc 
ggggtgccta 
agtcgggaaa 
gcccctaact 
tggctgacta 
ccagaagtag 
attgcagctt 
tttttttcac 
tggatccgct 
gctcttccgc 
tatcagctca 
agaacatgtg 
cgtttttcca 
ggtggcgaaa 
tgcgctctcc 
gaagcgtggc 
gctccaagct 
gtaactatcg 
ctggtaacag 
ggcctaacta 
fctaccttcgg 
gtggtttttt 
ctttgatctt 
tggtcatgag 
ttaaatcaat 
gtgaggcacc 
tcgtgtagat 
cgcgagaccc 
ccgagcgcag 
gggaagctag 
caggcatcgt 
gatcaaggcg 
ctccgatcgt 
tgcataattc 
caaccaagtc 
tacgggataa 
cttcggggcg 
ctcgtgcacc 
aaacaggaag 
tcatactctt 
gatacatatt 
gaaaagtgcc 



atgccctggc 
tcatgaagcc 
tagtgtgttg 
taaaacatca 
catgaacaaa 
tccttattcc 
ttattttttt 
ctcctctcct 
ctgcagccca 
gctcacaatt 
atgagtgagc 
cctgtcgtgc 
ccgcccatcc 
atttttttta 
tgaggaggct 
ataatggtta 
tgcattctag 
gcattaatga 
ttcctcgctc 
ctcaaaggcg 
agcaaaaggc 
taggctccgc 
c c c gac agga 
tgttccgacc 
gctttctcaa 
gggctgtgtg 
tcttgagtcc 
gattagcaga 
cggctacact 
aaaaagagtt 
tgtttgcaag 
ttctacgggg 
attatcaaaa 
ctaaagtata 
tatctcagcg 
aactacgata 
acgctcaccg 
aagtggtcct 
agtaagtagt 
ggtgtcacgc 
agttacatga 
tgtcagaagt 
tcttactgtc 
attctgagaa 
taccgcgcca 
aaaactctca 
caactgatct 
gcaaaatgcc 
cctttttcaa 
tgaatgtatt 
acctg 



tcacaaatac 
ccttgagcat 
gaattttttg 
gaatgagtat 

ggtggctata 

atagaaaagc 
ctttaacatc 
gactactccc 
agcttggcgt 
ccacacaaca 
taactcacat 
cagcggatcc 
cgcccctaac 
tttatgcaga 
tttttggagg 
caaataaagc 
ttgtggtttg 
at cggccaac 
actgactcgc 
gtaatacggt 
cagcaaaagg 
ccccctgacg 
ctataaagat 
ctgccgctta 
tgctcacgct 
cacgaacccc 
aacccggtaa 
gcgaggtatg 
agaaggacag 
ggtagctctt 
cagcagatta 
tctgacgctc 
aggatcttca 
tatgagtaaa 
atctgtctat 
cgggagggct 
gctccagatt 
gcaactttat 
tcgccagtta 
tcgtcgtttg 
tcccccatgt 
aagttggccg 
atgccatccg 
tagtgtatgc 
catagcagaa 
aggatcttac 
tcagcatctt 
gcaaaaaagg 
tattattgaa 
tagaaaaata 



cactgagatc 
ctgacttctg 
tgtctctcac 
ttggtttaga 
aagaggtcat 
cttgacttga 
cctaaaattt 
agtcatagct 
aatcatggtc 
tacgagccgg 
taattgcgtt 
gcatctcaat 
tccgcccagt 
ggccgaggcc 
cctaggcttt 
aatagcatca 
tccaaactca 
gcgcggggag 
tgcgctcggt 
tatccacaga 
ccaggaaccg 
agcatcacaa 
accaggcgtt 
ccggatacct 
gtaggtatct 
ccgttcagcc 
gacacgactt 
taggcggtgc 
tatttggtat 
gatccggcaa 
cgcgcagaaa 
agtggaacga 
cctagatcct 
cttggtctga 
ttcgttcatc 
taccatctgg 
tatcagcaat 
ccgcctccat 
atagtttgcg 
gtatggcttc 
tgtgcaaaaa 
c agt gt tat c 
taagatgctt 
ggcgaccgag 
ctttaaaagt 
cgctgttgag 
ttactttcac 
ga at aagggc 
gcatttatca 
aacaaatagg 



2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5855 



<210> 128 
<211> 303 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Human FER-1 Promoter



<400> 128 

tccatgacaa 

gacgccaccg 

gcgcgcgagg 

ccggaaggag 

gcggctataa 

acc 



agcacttttt 
ctgtcccaga 
gcctccagcg 
cgggctcggg 
gagaccacaa 



gagcccaagc 
ggcagtcggc 
gccgcccctc 
gcgggcggcg 
gcgacccgca 



ccagcctagc 
taccggtccc 
ccccacagca 
ctgattggcc 
gggccagacg 



tcgagctaaa 
cgctcccgag 

ggggcggggt 
ggggcgggcc 
ttcttcgccg 



cgggcacaga 60 
ctccgccaga 120 
cccgcgccca 180 
tgacgccgac 240 
agagtcgggt 300 
303 

<210> 129
<211> 6521
<212> DNA
<213> Artificial Sequence
<220>
<223> pIRES-BSR Plasmid
<400> 129
tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60
ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120
aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180
gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240
gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300
agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360
ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420
cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480
gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540
caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600
caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactg 660
cgatcgcccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga ggtctatata 720
agcagagctc gtttagtgaa ccgtcagatc actagaagct ttattgcggt agtttatcac 780
agttaaattg ctaacgcagt cagtgcttct gacacaacag tctcgaactt aagctgcagt 840
gactctctta aggtagcctt gcagaagttg gtcgtgaggc actgggcagg taagtatcaa 900
ggttacaaga caggtttaag gagaccaata gaaactgggc ttgtcgagac agagaagact 960
cttgcgtttc tgataggcac ctattggtct tactgacatc cactttgcct ttctctccac 1020
aggtgtccac tcccagttca attacagctc ttaaggctag agtacttaat acgactcact 1080
ataggctagc ctcgagaatt cacgcgtcga gcatgcatct agggcggcca attccgcccc 1140
tctccctccc ccccccctaa cgttactggc cgaagccgct tggaataagg ccggtgtgcg 1200
tttgtctata tgtgattttc caccatattg ccgtcttttg gcaatgtgag ggcccggaaa 1260
cctggccctg tcttcttgac gagcattcct aggggtcttt cccctctcgc caaaggaatg 1320
caaggtctgt tgaatgtcgt gaaggaagca gttcctctgg aagcttcttg aagacaaaca 1380
acgtctgtag cgaccctttg caggcagcgg aaccccccac ctggcgacag gtgcctctgc 1440
ggccaaaagc cacgtgtata agatacacct gcaaaggcgg cacaacccca gtgccacgtt 1500
gtgagttgga tagttgtgga aagagtcaaa tggctctcct caagcgtatt caacaagggg 1560
ctgaaggatg cccagaaggt accccattgt atgggatctg atctggggcc tcggtgcaca 1620
tgctttacat gtgtttagtc gaggttaaaa aaacgtctag gccccccgaa ccacggggac 1680
gtggttttcc tttgaaaaac acgatgataa gcttgccaca acccaccatg aaaacattta 1740
acatttctca acaagatcta gaattagtag aagtagcgac agagaagatt acaatgcttt 1800
atgaggataa taaacatcat gtgggagcgg caattcgtac gaaaacagga gaaatcattt 1860
cggcagtaca tattgaagcg tatataggac gagtaactgt ttgtgcagaa gccattgcga 1920
ttggtagtgc agtttcgaat ggacaaaagg attttgacac gattgtagct gttagacacc 1980
cttattctga cgaagtagat agaagtattc gagtggtaag tccttgtggt atgtgtaggg 2040
agttgatttc agactatgca ccagattgtt ttgtgttaat agaaatgaat ggcaagttag 2100
tcaaaactac gattgaagaa ctcattccac tcaaatatac ccgaaattaa aagttttacc 2160
ataccaagct tggcgggcgg ccgcttccct ttagtgaggg ttaatgcttc gagcagacat 2220
gataagatac attgatgagt ttggacaaac cacaactaga atgcagtgaa aaaaatgctt 2280
tatttgtgaa atttgtgatg ctattgcttt atttgtaacc attataagct gcaataaaca 2340
agttaacaac aacaattgca ttcattttat gtttcaggtt cagggggaga tgtgggaggt 2400
tttttaaagc aagtaaaacc tctacaaatg tggtaaaatc cgataaggat cgatccgggc 2460
tggcgtaata gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat 2520
ggcgaatgga cgcgccctgt agcggcgcat taagcgcggc gggtgtggtg gttacgcgca 2580
gcgtgaccgc tacacttgcc agcgccctag cgcccgctcc tttcgctttc ttcccttcct 2640
ttctcgccac gttcgccggc tttccccgtc aagctctaaa tcgggggctc cctttagggt 2700
tccgatttag agctttacgg cacctcgacc gcaaaaaact tgatttgggt gatggttcac 2760
gtagtgggcc atcgccctga tagacggttt ttcgcccttt gacgttggag tccacgttct 2820
ttaatagtgg actcttgttc caaactggaa caacactcaa ccctatctcg gtctattctt 2880
ttgatttata agggattttg ccgatttcgg cctattggtt aaaaaatgag ctgatttaac 2940
aaatatttaa cgcgaatttt aacaaaatat taacgtttac aatttcgcct gatgcggtat 3000
tttctcctta cgcatctgtg cggtatttca caccgcatac gcggatctgc gcagcaccat 3060
ggcctgaaat aacctctgaa agaggaactt ggttaggtac cttctgaggc ggaaagaacc 3120
agctgtggaa tgtgtgtcag ttagggtgtg gaaagtcccc aggctcccca gcaggcagaa 3180
gtatgcaaag catgcatctc aattagtcag caaccaggtg tggaaagtcc ccaggctccc 3240
cagcaggcag aagtatgcaa agcatgcatc tcaattagtc agcaaccata gtcccgcccc 3300
taactccgcc catcccgccc ctaactccgc ccagttccgc ccattctccg ccccatggct 3360
gactaatttt ttttatttat gcagaggccg aggccgcctc ggcctctgag ctattccaga 3420
agtagtgagg aggctttttt ggaggcctag gcttttgcaa aaagcttgat tcttctgaca 3480
caacagtctc gaacttaagg ctagagccac catgattgaa caagatggat tgcacgcagg 3540
ttctccggcc gcttgggtgg agaggctatt cggctatgac tgggcacaac agacaatcgg 3600
ctgctctgat gccgccgtgt tccggctgtc agcgcagggg cgcccggttc tttttgtcaa 3660
gaccgacctg tccggtgccc tgaatgaact gcaggacgag gcagcgcggc tatcgtggct 3720
ggccacgacg ggcgttcctt gcgcagctgt gctcgacgtt gtcactgaag cgggaaggga 3780
ctggctgcta ttgggcgaag tgccggggca ggatctcctg tcatctcacc ttgctcctgc 3840
cgagaaagta tccatcatgg ctgatgcaat gcggcggctg catacgcttg atccggctac 3900
ctgcccattc gaccaccaag cgaaacatcg catcgagcga gcacgtactc ggatggaagc 3960
cggtcttgtc gatcaggatg atctggacga agagcatcag gggctcgcgc cagccgaact 4020
gttcgccagg ctcaaggcgc gcatgcccga cggcgaggat ctcgtcgtga cccatggcga 4080
tgcctgcttg ccgaatatca tggtggaaaa tggccgcttt tctggattca tcgactgtgg 4140
ccggctgggt gtggcggacc gctatcagga catagcgttg gctacccgtg atattgctga 4200
agagcttggc ggcgaatggg ctgaccgctt cctcgtgctt tacggtatcg ccgctcccga 4260
ttcgcagcgc atcgccttct atcgccttct tgacgagttc ttctgagcgg gactctgggg 4320
ttcgaaatga ccgaccaagc gacgcccaac ctgccatcac gatggccgca ataaaatatc 4380
tttattttca ttacatctgt gtgttggttt tttgtgtgaa tcgatagcga taaggatccg 4440
cgtatggtgc actctcagta caatctgctc tgatgccgca tagttaagcc agccccgaca 4500
cccgccaaca cccgctgacg cgccctgacg ggcttgtctg ctcccggcat ccgcttacag 4560
acaagctgtg accgtctccg ggagctgcat gtgtcagagg ttttcaccgt catcaccgaa 4620
acgcgcgaga cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 4680
aatggtttct tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 4740
tttatttttc taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 4800
gcttcaataa tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 4860
tccctttttt gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 4920
aaaagatgct gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 4980
cggtaagatc cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 5040
agttctgcta tgtggcgcgg tattatcccg tattgacgcc gggcaagagc aactcggtcg 5100
ccgcatacac tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 5160
tacggatggc atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 5220
tgcggccaac ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 5280
caacatgggg gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 5340
accaaacgac gagcgtgaca ccacgatgcc tgtagcaatg gcaacaacgt tgcgcaaact 5400
attaactggc gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 5460
ggataaagtt gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 5520
taaatctgga gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 5580
taagccctcc cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 5640
aaatagacag atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 5700
agtttactca tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 5760
ggtgaagatc ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 5820
ctgagcgtca gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 5880
cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 5940
tcaagagcta ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 6000
tactgtcctt ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 6060
tacatacctc gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 6120
tcttaccggg ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 6180
ggggggttcg tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 6240
acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 6300
ggtaagcggc agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 6360
gtatctttat agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 6420
ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 6480
ggccttttgc tggccttttg ctcacatggc tcgacagatc t 6521

2. [ ] Claim Nos.:

because they relate to parts of the international application that do not comply with the prescribed requirements to 
such an extent that no meaningful international search can be carried out, specifically: 



3. [ ] Claim Nos.:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule
6.4(a).



Box II Observations where unity of invention is lacking (Continuation of Item 2 of first sheet) 



This International Searching Authority found multiple inventions in this international application, as follows:
Please See Continuation Sheet 



1. [X] As all required additional search fees were timely paid by the applicant, this international search report covers all

searchable claims. 

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite
3. [ ] As only some of the required additional search fees were timely paid by the applicant, this international search

report covers only those claims for which fees were paid, specifically claims Nos.: 



4. [ ] No required additional search fees were timely paid by the applicant. Consequently, this international search report
is restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 

This application contains the following inventions or groups of inventions which are not so
inventions to be searched, the appropriate additional search fees must be paid.

Group I, claim(s) 1-64, 67-71, 79, 84-96, 91-109, and 123, drawn to eukaryotic
recombinogenic chromosomes used for introducing heterologous nucleic acids into a
chromosome and the resulting cells.

Group II, claim(s) 65-66 and 87-88, drawn to a lambda intR mutein.

Group III, claim(s) 72-78, drawn to the production of transgenic animals.

Group IV, claim(s) 80, drawn to the production of an artificial chromosome library.

Group V, claim(s) 81-83, drawn to a library of cells for genomic screening.

Group VI, claim(s) 89-90, drawn to a modified iron-induced promoter.

Group VII, claim(s) 109-122, drawn to a method for screening compounds and their effects on
regulatory regions.

and it considers that the International Application does not comply with the requirements
of unity of invention (Rules 13.1, 13.2 and 13.3) for the reasons indicated below:

The inventions listed as Groups I-VII do not relate to a single inventive concept under PCT 
Rule 13.1 because, under PCT Rule 13.2, they lack the same or corresponding special 
technical features for the following reasons: 

The special technical feature of Group I which defines an advance over the art is a
eukaryotic chromosome containing recombinogenic sites that can be used to introduce
heterologous nucleic acids into chromosomes, and cells containing these recombinogenic
chromosomes.

The special technical feature of Group II involves a lambda intR mutein. This feature
defines an advance over Group I in that it involves a protein that is not required for the
technical features as set forth above in Group I.

The special technical feature of Group III involves the production of a transgenic animal,
which represents a second method for using the invention as set forth above in Group I.

The special technical feature of Group IV involves the production of artificial chromosome
expression system libraries, which represents a third method for using the invention as set
forth above in Group I.

The special technical feature of Group V involves a library of cells containing the
artificial chromosome expression system libraries set forth above in Group IV, and
represents a first product resulting from Group IV.

The special technical feature of Group VI involves a modified iron-inducible promoter.
This feature defines an advance over Group I in that it involves a promoter that is not
required for the technical features as set forth above in Group I.

The special technical feature of Group VII involves a method for screening compounds for
their effect on regulatory regions, which represents a fourth method of using the invention
as set forth above in Group I.

AMENDED CLAIMS
[received by the International Bureau on 27 January 2003 (27.01.2003);
Original claims 22, 29, 39, 40, 68, 104, 107, 108, 114 and 121 replaced by
Amended Claims 22, 29, 39, 40, 68, 104, 107, 108, 114 and 121;
Remaining claims unchanged]

13. The chromosome of claim 6 that is an artificial chromosome
expression system (ACes).

14. A platform artificial chromosome expression system (ACes)
comprising one or a plurality of sites that participate in recombinase
catalyzed recombination.

15. The ACes of claim 14 that contains one site.

16. The ACes of claim 14 that is predominantly heterochromatin.

17. The ACes of claim 14 that contains no more than about
30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% euchromatin.

18. The ACes of claim 14 that is a plant ACes.

19. The ACes of claim 14 that is an animal ACes.

20. The ACes of claim 14 that is selected from a fish, insect,
reptile, amphibian, arachnid or a mammalian ACes.

21. The ACes of claim 14 that is a fish ACes.

22. The artificial chromosome expression system (ACes) of claim
14, wherein the recombinase and site(s) are from the Cre/lox system of
bacteriophage P1, the int/att system of lambda phage, the FLP/FRT
system of yeast, the Gin/gix recombinase system of phage Mu, the Cin
recombinase system, the Pin recombinase system of E. coli, the R/RS
system of the pSR1 plasmid, or any combination thereof.

23. A method of introducing heterologous nucleic acid into a
chromosome, comprising:

contacting a chromosome of any of claims 1 or 14 with a nucleic
acid molecule comprising both the heterologous nucleic acid and a
recombination site, in the presence of a recombinase that promotes
recombination between the sites in the chromosome and in the nucleic
acid molecule.

24. The method of claim 23, wherein the recombinase is
selected from the group consisting of Cre, Gin, Cin, Pin, FLP, a phage
integrase and R from the pSR1 plasmid.

25. The method of claim 23, wherein the nucleic acid molecule
encodes a therapeutic protein, antisense nucleic acid, or comprises an
artificial chromosome.

26. The method of claim 25, wherein the nucleic acid molecule
comprises a yeast artificial chromosome (YAC), a bacterial artificial
chromosome (BAC) or an insect artificial chromosome (IAC).

27. A combination, comprising the chromosome of claim 1 and a
first vector comprising the cognate recombination site, wherein the
cognate recombination site is a site that recombines with the site
engineered into the chromosome.

28. The combination of claim 27, further comprising nucleic acid
encoding a recombinase, wherein the nucleic acid is on a second vector
or on the first vector, or on the ACes under an inducible promoter.

29. The combination of claim 28, wherein the recombinase and
sites are from the Cre/lox system of bacteriophage P1, the int/att system
of lambda phage, the FLP/FRT system of yeast, the Gin/gix recombinase
system of phage Mu, the Pin recombinase system of E. coli, the R/RS
system of the pSR1 plasmid, or any combination thereof.

30. The combination of claim 28, wherein a vector is the plasmid
pCXLamIntR.

31. The combination of claim 27, wherein a vector is the plasmid
pDsRedN1-attB.

32. A kit, comprising the combination of claim 27 and optionally
instructions for introducing heterologous nucleic acid into the
chromosome.

33. A method for introducing heterologous nucleic acid into a
platform artificial chromosome, comprising:

(a) mixing an artificial chromosome comprising at least a first
recombination site and a vector comprising at least a second
recombination site and the heterologous nucleic acid;

(b) incubating the resulting mixture in the presence of at least one
recombination protein under conditions whereby recombination between
the first and second recombination sites is effected, thereby introducing
the heterologous nucleic acid into the artificial chromosome.

34. The method of claim 33, wherein the artificial chromosome is
an ACes.

35. The method of claim 33, wherein said mixing step (a) is
conducted in cells ex vivo.

36. The method of claim 33, wherein said mixing step (a) is
conducted extracellularly in an in vitro reaction mixture.

37. The method of claim 33, wherein the at least one
recombination protein is encoded by a bacteriophage selected from the
group consisting of bacteriophage lambda, phi 80, P22, P2, 186, P4 and
P1.

38. The method of claim 37, wherein the at least one
recombination protein is encoded by bacteriophage lambda, or mutants
thereof.

39. The method of claim 33, wherein at least one recombination
protein is selected from the group consisting of Int, IHF, Xis, Cre, γδ, Tn3
resolvase, Hin, Gin, Cin and Flp.

40. The method of claim 33, wherein the recombination sites are
selected from the group consisting of att and loxP sites.

66. The lambda-intR mutein of claim 65, wherein the lambda-intR
mutein comprises SEQ ID NO:37.

67. The method of claim 46 wherein the promoterless marker is
transcriptionally downstream of the heterologous nucleic acid, wherein
the heterologous nucleic acid encodes a heterologous protein, and
wherein the expression level of the selectable marker is transcriptionally
linked to the expression level of the heterologous protein.

68. The method of claim 67, wherein the selectable marker and
the heterologous nucleic acid are transcriptionally linked by the presence
of an IRES between them.

69. The method of claim 68, wherein the selectable marker is
selected from the group consisting of an antibiotic resistance gene, and a
detectable protein, wherein the detectable protein is chromogenic or
fluorescent.

70. The method of claim 69, wherein the selectable marker is
selected from the group consisting of green fluorescent protein (GFP), red
fluorescent protein (RFP), blue fluorescent protein (BFP), and E. coli
histidinol dehydrogenase.

71. The method of claim 67 further comprising expressing the
heterologous protein and isolating the heterologous protein.

72. A method for producing a transgenic animal, comprising
introducing a platform-ACes into an embryonic cell.

73. The method of claim 72, wherein the embryonic cell is a
stem cell.

74. The method of claim 72, wherein the embryonic cell is in an
embryo.

75. The method of claim 72, wherein the platform-ACes
comprises heterologous nucleic acid that encodes a therapeutic product.

a sequence of nucleotides that targets the vector to an
amplifiable region of a chromosome.

92. The vector of claim 91, wherein the amplifiable region
comprises heterochromatic nucleic acid.

93. The vector of claim 91, wherein the amplifiable region
comprises rDNA.

94. The vector of claim 93, wherein the rDNA comprises an
intergenic spacer.

95. The vector of claim 91, further comprising nucleic acid
encoding a selectable marker that is not operably associated with any
promoter.

96. The vector of claim 91, wherein the chromosome is a
mammalian chromosome.

97. The vector of claim 91, wherein the chromosome is a plant
chromosome.

98. A cell of claim 57 that is a plant cell, wherein the ACes
platform is a MAC.

99. The plant cell of claim 98, wherein the MAC comprises
transcriptional regulatory sequence of nucleotides derived from plants.

100. The plant cell of claim 99, wherein the regulatory sequence
is selected from the group consisting of promoters, terminators,
enhancers, silencers and transcription factor binding sites.

101. A cell of claim 57 that is an animal cell, wherein the ACes
platform is a plant artificial chromosome (PAC).

102. The cell of claim 101 that is a mammalian cell.

103. The cell of claim 98, wherein the MAC comprises
transcriptional regulatory sequence of nucleotides derived from plants.

104. The cell of claim 102, wherein the PAC comprises
transcriptional regulatory sequence of nucleotides derived from animals.

105. The cell of claim 104, wherein the regulatory sequence is
selected from the group consisting of promoters, terminators, enhancers,
silencers and transcription factor binding sites.

106. A method, comprising:

introducing a vector of claim 91 into a cell;

growing the cells; and

selecting a cell comprising an artificial chromosome that comprises
one or more repeat regions.

107. The method of claim 106, wherein a sufficient portion of the
vector integrates into a chromosome in the cell to result in amplification
of chromosomal DNA.

108. The method of claim 106, wherein the artificial chromosome
is an ACes.

109. A method for screening, comprising:

contacting a cell comprising a reporter ACes with test compounds
or known compounds, wherein:

the reporter ACes comprises one or a plurality of reporter
constructs;

a reporter construct comprises a reporter gene in operative linkage
with a regulatory region responsive to test or known compounds; and

detecting any increase or decrease in signal output from the
reporter, wherein a change in the signal is indicative of activity of the test
or known compound on the regulatory region.
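
By way of illustration only, and forming no part of the claims, the detecting step of claim 109 can be pictured as a comparison of reporter signal with and without compound. The sketch below is a minimal example under stated assumptions: the function names, the fold-change criterion and the two-fold cutoff are hypothetical and are not taken from the application.

```python
from typing import Dict


def classify_response(baseline: float, treated: float, min_fold: float = 2.0) -> str:
    """Label a reporter readout relative to an untreated baseline.

    min_fold is an illustrative cutoff (assumption), not a value from the claims.
    """
    if baseline <= 0 or treated < 0:
        raise ValueError("baseline must be positive and treated must be non-negative")
    if min_fold <= 1:
        raise ValueError("min_fold must be greater than 1")
    fold = treated / baseline
    if fold >= min_fold:
        return "increase"
    if fold <= 1.0 / min_fold:
        return "decrease"
    return "no change"


def screen(baseline: float, readouts: Dict[str, float], min_fold: float = 2.0) -> Dict[str, str]:
    """Classify the reporter signal measured for each test compound."""
    return {name: classify_response(baseline, value, min_fold)
            for name, value in readouts.items()}


if __name__ == "__main__":
    # Hypothetical signal values in arbitrary fluorescence units.
    print(screen(100.0, {"compound A": 250.0, "compound B": 40.0, "compound C": 120.0}))
```

A compound labeled "increase" or "decrease" in such a comparison would correspond to one showing activity on the regulatory region; any real assay would choose its own statistics and controls.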

110. The method of claim 109, wherein the reporter is operatively
linked to a promoter that controls expression of a gene in a signal
transduction pathway, whereby activation or reduction in the signal
indicates that the pathway is activated or down-regulated by the test
compound.

111. The method of claim 109, wherein the reporter in the
construct encodes drug resistance or encodes a fluorescent protein.

112. The method of claim 111, wherein the fluorescent protein is
selected from the group consisting of red, green and blue fluorescent
proteins.

113. The method of claim 109, wherein the ACes comprises a
plurality of reporter-linked constructs, each with a different reporter,
whereby the pathway(s) affected by the test compounds can be
elucidated.

114. The method of claim 109, wherein a reporter is operatively
linked to a promoter that is transcriptionally regulated in response to DNA
damage, and the test compounds are genotoxicants.

115. The method of claim 114, wherein the DNA damage is
induced by apoptosis, necrosis or cell-cycle perturbations.

116. The method of claim 114, wherein unknown compounds are
screened to assess whether they are genotoxicants.

117. The method of claim 114, wherein the promoter is a
cytochrome P450-profiled promoter.

118. The method of claim 114, wherein the cell is in a transgenic
animal and toxicity is assessed in the animal.

119. The method of claim 109, wherein:

the cell is a patient cell sample; the patient has a disease;

the regulatory region is one targeted by a drug or drug regimen;
and

the method assesses the effectiveness of a treatment for the
disease for the particular patient.

120. The method of claim 119, wherein the cell is a tumor cell.

121. The method of claim 109, wherein the cell is a stem cell or a
progenitor cell, whereby expression of the reporter is operatively linked to
a regulatory region expressed in the cells to thereby identify stem cells or
progenitor cells.

WO 2002/097059    PCT/US2002/017452

122. The method of claim 109, wherein the cell is in an animal;
and the method comprises whole-body imaging to monitor expression of
the reporter in the animal.

123. A reporter ACes comprises one or a plurality of reporter 
constructs, wherein the reporter construct comprises a reporter gene in 
operative linkage with a regulatory region responsive to test or known 
compounds. 

CHROMOSOME-BASED PLATFORMS

RELATED APPLICATIONS

Benefit of priority to U.S. provisional application Serial No.
60/294,758, filed May 30, 2001, to Perkins, et al., entitled
"CHROMOSOME-BASED PLATFORMS" and to U.S. provisional application
Serial No. 60/366,891, filed March 21, 2002, to Perkins, et al., entitled
"CHROMOSOME-BASED PLATFORMS" is claimed. Where permitted, the
subject matter of each is incorporated herein by reference in its
entirety.

This application is related to Provisional Application No.
60/294,687, filed May 30, 2001, by CARL PEREZ AND STEVEN
FABIJANSKI entitled PLANT ARTIFICIAL CHROMOSOMES, USES
THEREOF AND METHODS FOR PREPARING PLANT ARTIFICIAL
CHROMOSOMES and to U.S. Provisional Application No. 60/296,329,
filed June 4, 2001, by CARL PEREZ AND STEVEN FABIJANSKI entitled
PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS
FOR PREPARING PLANT ARTIFICIAL CHROMOSOMES. This application
also is related to U.S. Provisional Application No. 60/294,758, filed May
30, 2001, by EDWARD PERKINS et al., entitled CHROMOSOME-BASED
PLATFORMS and to U.S. Provisional Application No. 60/366,891, filed
March 21, 2002, by EDWARD PERKINS et al., entitled
CHROMOSOME-BASED PLATFORMS. This application is also related to
U.S. application Serial Nos. (attorney dkt nos. 24601-419 and 419PC),
filed on the same day herewith, entitled PLANT ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS OF PREPARING
PLANT ARTIFICIAL CHROMOSOMES to Perez et al.

This application is related to U.S. application Serial No.
08/695,191, filed August 7, 1996 by GYULA HADLACZKY and ALADAR
SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND
METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES, now U.S.
Patent No. 6,025,155. This application is also related to U.S. application
Serial No. 08/682,080, filed July 15, 1996 by GYULA HADLACZKY and
ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF
AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES, now
U.S. Patent No. 6,077,697. This application is also related to U.S.
application Serial No. 08/629,822, filed April 10, 1996 by GYULA
HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING
ARTIFICIAL CHROMOSOMES (now abandoned), and is also related to
copending U.S. application Serial No. 09/096,648, filed June 12, 1998,
by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING
ARTIFICIAL CHROMOSOMES and to U.S. application Serial No.
09/835,682, April 10, 1997 by GYULA HADLACZKY and ALADAR
SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND
METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES (now
abandoned). This application is also related to copending U.S. application
Serial No. 09/724,726, filed November 28, 2000, U.S. application Serial
No. 09/724,872, filed November 28, 2000, U.S. application Serial No.
09/724,693, filed November 28, 2000, U.S. application Serial No.
09/799,462, filed March 5, 2001, U.S. application Serial No.
09/836,911, filed April 17, 2001, and U.S. application Serial No.
10/125,767, filed April 17, 2002, each of which is by GYULA
HADLACZKY and ALADAR SZALAY, and is entitled ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING
ARTIFICIAL CHROMOSOMES. This application is also related to
International PCT application No. WO 97/40183. Where permitted, the
subject matter of each of these provisional applications, international
applications, and applications is incorporated by reference in its entirety.

FIELD OF INVENTION

Artificial chromosomes, including ACes, that have been engineered
to contain available sites for site-specific, recombination-directed
integration of DNA of interest are provided. These artificial chromosomes
permit tractable, efficient, rational engineering of the chromosome.

BACKGROUND

Artificial chromosomes

A variety of artificial chromosomes for use in plants and animals,
particularly higher plants and animals, are available. In particular, U.S.
Patent Nos. 6,025,155 and 6,077,697 provide heterochromatic artificial
chromosomes designated therein as satellite artificial chromosomes
(SATACs) and now designated artificial chromosome expression systems
(ACes). These chromosomes are prepared by introducing heterologous
DNA into a selected plant or animal cell under conditions that result in
integration into a region of the chromosome that leads to an amplification
event resulting in production of a dicentric chromosome. Subsequent
treatment and growth of cells with dicentric chromosomes, including
further amplifications, ultimately results in the artificial chromosomes
provided therein. In order to introduce a desired heterologous gene (or a
plurality of heterologous genes) into the artificial chromosome, the
process is repeated introducing the desired heterologous genes and
nucleic acids in the initial targeting step. This process is time consuming
and tedious. Hence, more tractable and efficient methods for introducing
heterologous nucleic acid molecules into artificial chromosomes,
particularly ACes, are needed.

Therefore, it is an object herein to provide engineered artificial
chromosomes that permit tractable, efficient and rational engineering of
artificial chromosomes.

SUMMARY OF THE INVENTION

Provided herein are artificial chromosomes that permit tractable,
efficient and rational engineering thereof. In particular, the artificial
chromosomes provided herein contain one or a plurality of loci (sites) for
site-specific, recombination-directed integration of DNA. Thus, provided
herein are platform artificial chromosome expression systems ("platform
ACes") containing single or multiple site-specific recombination sites.
The artificial chromosomes and ACes artificial chromosomes include plant
and animal chromosomes. Any recombinase system that effects
site-specific recombination is contemplated for use herein.

In one embodiment, chromosomes, including platform ACes, are
provided that contain one or more lambda att sites designed for
recombination-directed integration in the presence of lambda integrase,
and that are mutated so that they do not require additional factors.
Methods for preparing such chromosomes, vectors for use in the
methods, and uses of the resulting chromosomes are also provided.
Platform ACes containing the recombination site(s) and methods for
introducing heterologous nucleic acid into such sites and vectors therefor,
are provided.
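
The outcome of recombination-directed integration at a platform site can be pictured with a small model. The sketch below is illustrative only and rests on assumptions not stated in the text: a site is written as a hypothetical (left arm, core, right arm) triple, the platform chromosome is a linear list carrying a site labeled attP, and the vector is a circular list carrying a site labeled attB. A single crossover in the shared core leaves the vector's contents flanked by two hybrid sites.

```python
from typing import List, Tuple, Union

# Hypothetical representation: (left arm, core, right arm).
Site = Tuple[str, str, str]
Element = Union[str, Site]

ATT_P: Site = ("P", "O", "P'")
ATT_B: Site = ("B", "O", "B'")


def integrate(chromosome: List[Element], vector: List[Element],
              chrom_site: Site, vector_site: Site) -> List[Element]:
    """Insert a circular vector into a linear chromosome by one crossover
    between chrom_site and vector_site, which must share the same core."""
    if chrom_site[1] != vector_site[1]:
        raise ValueError("sites do not share a core; no recombination modeled")
    i = chromosome.index(chrom_site)   # position of the platform site
    j = vector.index(vector_site)      # position of the site on the circle
    # Open the circle at its site: elements after the site, then those before it.
    payload = vector[j + 1:] + vector[:j]
    left_hybrid: Site = (chrom_site[0], chrom_site[1], vector_site[2])
    right_hybrid: Site = (vector_site[0], vector_site[1], chrom_site[2])
    return chromosome[:i] + [left_hybrid] + payload + [right_hybrid] + chromosome[i + 1:]


if __name__ == "__main__":
    platform = ["centromere", ATT_P, "telomere"]
    donor = [ATT_B, "marker", "gene of interest"]
    print(integrate(platform, donor, ATT_P, ATT_B))
```

In this toy model the original sites are consumed and replaced by hybrid sites, which is one way to picture why an integration event of this kind is stable unless further factors act on the hybrid sites.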

Also provided herein is a bacteriophage lambda (λ) integrase
site-specific recombination system.

Methods using recombinase mediated recombination target gene
expression vectors and/or genes for insertion thereof into platform
chromosomes and the resulting chromosomes are provided.

Combinations and kits containing the combinations of vectors
encoding a recombinase and integrase and primers for introduction of the
site recognized thereby are also provided. The kits optionally include
instructions for performing site-directed integration or preparation of ACes
containing such sites.

Also provided herein are mammalian and plant cells comprising the
artificial chromosomes and ACes described herein. The cells can be
nuclear donor cells, stem cells, such as a mesenchymal stem cell, a
hematopoietic stem cell, an adult stem cell or an embryonic stem cell.

Also provided is a lambda-intR mutein comprising a glutamic acid to
arginine change at position 174 of wild-type lambda-integrase. Also
provided are transgenic animals and methods for producing a transgenic
animal, comprising introducing an ACes into an embryonic cell, such as a
stem cell or embryo. The ACes can comprise heterologous nucleic acid
that encodes a therapeutic product. The transgenic animal can be a fish,
insect, reptile, amphibian, arachnid or mammal. In certain embodiments,
the ACes is introduced by cell fusion, lipid-mediated transfection by a
carrier system, microinjection, microcell fusion, electroporation,
microprojectile bombardment or direct DNA transfer.

The platform ACes, including plant and animal ACes, such as
MACs, provided herein can be introduced into cells, such as, but not
limited to, animal cells, including mammalian cells, and into plant cells.
Hence plant cells that contain platform MACs, animal cells that contain
platform PACs and other combinations of cells and platform ACes are
provided.

DESCRIPTION OF FIGURES 

FIGURE 1 provides a diagram depicting creation of an exemplary
ACes artificial chromosome prepared using methods detailed in U.S.
Patent Nos. 6,025,155 and 6,077,697 and International PCT application
No. WO 97/40183. In this exemplified embodiment, the nucleic acid is
targeted to an acrocentric chromosome in an animal or plant, and the
heterologous nucleic acid includes a sequence-specific recombination site
and marker genes.

FIGURE 2 provides a map of pWEPuro9K, which is a targeting
vector derived from the vector pWE15 (GenBank Accession # X65279;
SEQ ID No. 31). Plasmid pWE15 was modified by replacing the SalI
(Klenow filled)/SmaI neomycin resistance-encoding fragment with the
PvuII/BamHI (Klenow filled) puromycin resistance-encoding fragment
(isolated from plasmid pPUR, Clontech Laboratories, Inc., Palo Alto, CA;
GenBank Accession no. U07648; SEQ ID No. 30), resulting in plasmid
pWEPuro. Subsequently a 9 Kb NotI fragment from the plasmid pFK161
(see Example 1; see also Csonka et al. (2000) Journal of Cell Science
113:3207-3216; and SEQ ID NO: 118), containing a portion of the
mouse rDNA region, was cloned into the NotI site of pWEPuro, resulting in
plasmid pWEPuro9K.

FIGURE 3 depicts construction of an ACes platform chromosome
with a single recombination site, such as a loxP site or an attP or attB site.
This platform ACes chromosome is an exemplary artificial chromosome
with a single recombination site.

FIGURE 4 provides a map of plasmid pSV40-193attPsensePur.

FIGURE 5 depicts a method for formation of a chromosome
platform with multiple recombination integration sites, such as attP sites.

FIGURE 6 sets forth the sequences of the core region of attP, attB, 
attL and attR (SEQ ID Nos. 33-36). 

FIGURE 7 depicts insertional recombination of a vector encoding a
marker gene, DsRed and an attB site with an artificial chromosome
containing an attP site.

FIGURE 8 provides a map of plasmid pCXLamIntR (SEQ ID NO:
112), which includes the Lambda integrase (E174R)-encoding nucleic
acid.

FIGURE 9 diagrammatically summarizes the platform technology;
marker 1 permits selection of the artificial chromosomes containing the
integration site; marker 2, which is promoterless in the target gene
expression vector, permits selection of recombinants. Upon
recombination with the platform, marker 2 is expressed under the control
of a promoter resident on the platform.

FIGURE 10 provides the vector map for the plasmid
p18attBZEO-5'6XHS4eGFP (SEQ ID NO: 116).

FIGURE 11 provides the vector map for the plasmid
p18attBZEO-3'6XHS4eGFP (SEQ ID NO: 115).

FIGURE 12 provides the vector map for the plasmid
p18attBZEO-(6XHS4)2eGFP (SEQ ID NO: 110).

FIGURES 13 AND 14 depict the integration of a PCR product by
site-specific recombination as set forth in Example 8.

FIGURE 15 provides the vector map for the plasmid pPACrDNA as
set forth in Example 9.A.

DETAILED DESCRIPTION OF THE INVENTION 
A. DEFINITIONS 

Unless defined otherwise, all technical and scientific terms used
herein have the same meaning as is commonly understood by one of skill
in the art to which the invention(s) belong. All patents, patent
applications, published applications and publications, GenBank sequences,
websites and other published materials referred to throughout the entire
disclosure herein, unless noted otherwise, are incorporated by reference
in their entirety. Where reference is made to a URL or other such
identifier or address, it is understood that such identifiers can change and
particular information on the internet can come and go, but equivalent
information can be found by searching the internet. Reference thereto
evidences the availability and public dissemination of such information.

As used herein, nucleic acid refers to single-stranded and/or
double-stranded polynucleotides, such as deoxyribonucleic acid (DNA)
and ribonucleic acid (RNA), as well as analogs or derivatives of either
RNA or DNA. Also included in the term "nucleic acid" are analogs of
nucleic acids such as peptide nucleic acid (PNA), phosphorothioate DNA,
and other such analogs and derivatives. When referring to probes or
primers, optionally labeled with a detectable label, such as a fluorescent
label or radiolabel, single-stranded molecules are contemplated. Such
molecules are typically of a length such that they are statistically unique
and of low copy number (typically less than 5, preferably less than 3) for
probing or priming a library. Generally a probe or primer contains at least
14, 16 or 30 contiguous nucleotides of sequence complementary to or
identical to a gene of interest. Probes and primers can be 10, 20, 30, 50,
100 or more nucleotides long.
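
The notion of a probe being "statistically unique" can be illustrated with a
rough estimate of how often a sequence of a given length would be expected to
occur by chance. The sketch below is only an illustration under stated
assumptions that are not part of this disclosure: bases are treated as
independent and equally frequent, and the genome size passed in (3 x 10^9 base
pairs) is a hypothetical round figure. The function name
expected_random_hits is likewise hypothetical.

```python
def expected_random_hits(probe_len: int, genome_bp: int,
                         both_strands: bool = True) -> float:
    """Expected number of chance matches to one specific probe sequence.

    Assumption (illustrative only): every base is drawn independently and
    uniformly from A, C, G and T, so any given probe_len-mer matches a
    given position with probability 1 / 4**probe_len.
    """
    # Number of start positions at which a probe of this length can align.
    positions = max(genome_bp - probe_len + 1, 0)
    # A single-stranded probe can anneal to either strand of duplex DNA.
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** probe_len


# Hypothetical genome size, used only to compare the lengths named above.
GENOME_BP = 3_000_000_000

for length in (14, 16, 30):
    hits = expected_random_hits(length, GENOME_BP)
    print(f"{length}-mer: about {hits:.3g} chance matches expected")
```

Under these assumptions the expected number of chance matches falls by a
factor of four for each added nucleotide, which is why longer probes are more
likely to be unique in a complex library. Real genomes contain repeats and
biased base composition, so the estimate is a lower bound on redundancy, not a
prediction.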

As used herein, DNA is meant to include all types and sizes of DNA
molecules including cDNA, plasmids and DNA including modified
nucleotides and nucleotide analogs.

As used herein, nucleotides include nucleoside mono-, di-, and
triphosphates. Nucleotides also include modified nucleotides, such as,
but not limited to, phosphorothioate nucleotides and deazapurine
nucleotides and other nucleotide analogs.

As used herein, heterologous or foreign DNA and RNA are used
interchangeably and refer to DNA or RNA that does not occur naturally as
part of the genome in which it is present or which is found in a location
or locations and/or in amounts in a genome or cell that differ from that in
which it occurs in nature. Heterologous nucleic acid is generally not
endogenous to the cell into which it is introduced, but has been obtained
from another cell or prepared synthetically. Generally, although not
necessarily, such nucleic acid encodes RNA and proteins that are not
normally produced by the cell in which it is expressed. Any DNA or RNA
that one of skill in the art would recognize or consider as heterologous or
foreign to the cell in which it is expressed is herein encompassed by
heterologous DNA. Heterologous DNA and RNA may also encode RNA or
proteins that mediate or alter expression of endogenous DNA by affecting
transcription, translation, or other regulatable biochemical processes.

Examples of heterologous DNA include, but are not limited to, DNA
that encodes a gene product or gene product(s) of interest, introduced for
purposes of modification of the endogenous genes or for production of an
encoded protein. For example, a heterologous or foreign gene may be
isolated from a different species than that of the host genome, or
alternatively, may be isolated from the host genome but operably linked
to one or more regulatory regions which differ from those found in the
unaltered, native gene. Other examples of heterologous DNA include, but
are not limited to, DNA that encodes traceable marker proteins, such as a
protein that confers traits including, but not limited to, herbicide, insect,
or disease resistance, or traits including, but not limited to, oil quality or
carbohydrate composition. Antibodies that are encoded by heterologous
DNA may be secreted or expressed on the surface of the cell in which the
heterologous DNA has been introduced.

As used herein, operative linkage or operative association, or
grammatical variations thereof, of heterologous DNA to regulatory and
effector sequences of nucleotides, such as promoters, enhancers,
transcriptional and translational stop sites, and other signal sequences
refers to the relationship between such DNA and such sequences of
nucleotides. For example, operative linkage of heterologous DNA to a
promoter refers to the physical relationship between the DNA and the
promoter such that the transcription of such DNA is initiated from the
promoter by an RNA polymerase that specifically recognizes, binds to and
transcribes the DNA.

In order to optimize expression and/or in vitro transcription, it may
be necessary to remove, add or alter 5' untranslated portions of the
clones to eliminate extra, potentially inappropriate alternative translation
initiation (i.e., start) codons or other sequences that may interfere with or
reduce expression, either at the level of transcription or translation.
Alternatively, consensus ribosome binding sites (see, e.g., Kozak (1991)
J. Biol. Chem. 266:19867-19870) can be inserted immediately 5' of the
start codon and may enhance expression.

As used herein, a sequence complementary to at least a portion of
an RNA, with reference to antisense oligonucleotides, means a sequence
having sufficient complementarity to be able to hybridize with the RNA,
preferably under moderate or high stringency conditions, forming a stable
duplex. The ability to hybridize depends on the degree of
complementarity and the length of the antisense nucleic acid. The longer
the hybridizing nucleic acid, the more base mismatches it can contain and
still form a stable duplex (or triplex, as the case may be). One skilled in
the art can ascertain a tolerable degree of mismatch by use of standard
procedures to determine the melting point of the hybridized complex.

As used herein, regulatory molecule refers to a polymer of
deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) or a polypeptide
that is capable of enhancing or inhibiting expression of a gene.

As used herein, recognition sequences are particular sequences of
nucleotides that a protein, DNA, or RNA molecule, or combinations
thereof (such as, but not limited to, a restriction endonuclease, a
modification methylase and a recombinase), recognizes and binds. For
example, a recognition sequence for Cre recombinase (see, e.g., SEQ ID
NO:58) is a 34 base pair sequence containing two 13 base pair inverted
repeats (serving as the recombinase binding sites) flanking an 8 base pair
core and designated loxP (see, e.g., Sauer (1994) Current Opinion in
Biotechnology 5:521-527). Other examples of recognition sequences
include, but are not limited to, attB and attP, attR and attL and others
(see, e.g., SEQ ID Nos. 8, 41-56 and 72) that are recognized by the
recombinase enzyme Integrase (see SEQ ID Nos. 37 and 38 for the
nucleotide and encoded amino acid sequences of an exemplary lambda
phage integrase).

The recombination site designated attB is an approximately 33 base
pair sequence containing two 9 base pair core-type Int binding sites and a
7 base pair overlap region; attP (SEQ ID No. 72) is an approximately 240
base pair sequence containing core-type Int binding sites and arm-type Int
binding sites as well as sites for auxiliary proteins IHF, FIS, and Xis (see,
e.g., Landy (1993) Current Opinion in Biotechnology 3:699-707; see,
e.g., SEQ ID Nos. 8 and 72).

As used herein, a recombinase is an enzyme that catalyzes the
exchange of DNA segments at specific recombination sites. An integrase
herein refers to a recombinase that is a member of the lambda (λ)
integrase family.

As used herein, recombination proteins include excisive proteins,
integrative proteins, enzymes, co-factors and associated proteins that are
involved in recombination reactions using one or more recombination sites
(see Landy (1993) Current Opinion in Biotechnology 3:699-707). The
recombination proteins used herein can be delivered to a cell via an
expression cassette on an appropriate vector, such as a plasmid, and the
like. In other embodiments, the recombination proteins can be delivered
to a cell in protein form in the same reaction mixture used to deliver the
desired nucleic acid, such as a platform ACes, donor target vectors, and
the like.





As used herein, the expression "lox site" means a sequence of
nucleotides at which the gene product of the cre gene, referred to
herein as Cre, can catalyze a site-specific recombination event. A LoxP
site is a 34 base pair nucleotide sequence from bacteriophage P1 (see,
e.g., Hoess et al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398-3402).
The LoxP site contains two 13 base pair inverted repeats separated by an
8 base pair spacer region as follows (SEQ ID NO. 57):

ATAACTTCGTATA ATGTATGC TATACGAAGTTAT

E. coli DH5Δlac and yeast strain BSY23 transformed with plasmid pBS44
carrying two loxP sites connected with a LEU2 gene are available from
the American Type Culture Collection (ATCC) under accession numbers
ATCC 53254 and ATCC 20773, respectively. The lox sites can be
isolated from plasmid pBS44 with restriction enzymes EcoRI and SalI, or
XhoI and BamHI. In addition, a preselected DNA segment can be inserted
into pBS44 at either the SalI or BamHI restriction enzyme sites. Other lox
sites include, but are not limited to, LoxB, LoxL, LoxC2 and LoxR sites,
which are nucleotide sequences isolated from E. coli (see, e.g., Hoess et
al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398). Lox sites can also be
produced by a variety of synthetic techniques (see, e.g., Ito et al. (1982)
Nuc. Acid Res. 10:1755 and Ogilvie et al. (1981) Science 214:270).
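
The loxP architecture described above (two 13 base pair inverted repeats
around an 8 base pair spacer) can be checked directly from the sequence given.
The following is a minimal sketch for illustration only; the helper names
reverse_complement and split_lox are hypothetical and are not part of this
disclosure.

```python
# loxP as written above: left arm + spacer + right arm.
LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_lox(site: str, arm: int = 13, spacer: int = 8):
    """Split a lox-type site into (left arm, spacer, right arm)."""
    if len(site) != 2 * arm + spacer:
        raise ValueError("site length does not match arm/spacer sizes")
    return site[:arm], site[arm:arm + spacer], site[arm + spacer:]


left, core, right = split_lox(LOXP)

# Inverted repeats: the right arm is the reverse complement of the left arm.
arms_are_inverted_repeats = reverse_complement(left) == right

# The spacer is not its own reverse complement, which gives the site
# a direction (orientation) on the DNA molecule.
spacer_is_asymmetric = reverse_complement(core) != core

print(len(LOXP), arms_are_inverted_repeats, spacer_is_asymmetric)
```

Because the spacer is asymmetric, a loxP site read on the opposite strand
differs from the site read on the given strand only in its spacer, and that
difference is what defines the orientation of the site.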

As used herein, the expression "cre gene" means a sequence of
nucleotides that encodes a gene product that effects site-specific
recombination of DNA in eukaryotic cells at lox sites. One cre gene can
be isolated from bacteriophage P1 (see, e.g., Abremski et al. (1983) Cell
32:1301-1311). E. coli DH1 and yeast strain BSY90 transformed with
plasmid pBS39 carrying a cre gene isolated from bacteriophage P1 and a
GAL1 regulatory nucleotide sequence are available from the American
Type Culture Collection (ATCC) under accession numbers ATCC 53255
and ATCC 20772, respectively. The cre gene can be isolated from
plasmid pBS39 with restriction enzymes XhoI and SalI.

As used herein, site-specific recombination refers to recombination
that is effected between two specific sites on a single
nucleic acid molecule or between two different molecules and that requires
the presence of an exogenous protein, such as an integrase or
recombinase.

For example, Cre-lox site-specific recombination can include the
following three events:

a. deletion of a pre-selected DNA segment flanked by lox
sites;

b. inversion of the nucleotide sequence of a pre-selected
DNA segment flanked by lox sites; and

c. reciprocal exchange of DNA segments proximate to
lox sites located on different DNA molecules.

This reciprocal exchange of DNA segments can result in an
integration event if one or both of the DNA molecules are circular. DNA
segment refers to a linear fragment of single- or double-stranded
deoxyribonucleic acid (DNA), which can be derived from any source.
Since the lox site is an asymmetrical nucleotide sequence, two lox sites
on the same DNA molecule can have the same or opposite orientations
with respect to each other. Recombination between lox sites in the same
orientation results in a deletion of the DNA segment located between the
two lox sites and a connection between the resulting ends of the original
DNA molecule. The deleted DNA segment forms a circular molecule of
DNA. The original DNA molecule and the resulting circular molecule each
contain a single lox site. Recombination between lox sites in opposite
orientations on the same DNA molecule results in an inversion of the
nucleotide sequence of the DNA segment located between the two lox
sites. In addition, reciprocal exchange of DNA segments proximate to lox
sites located on two different DNA molecules can occur. All of these
recombination events are catalyzed by the gene product of the cre gene.
Thus, the Cre-lox system can be used to specifically delete, invert, or
insert DNA. The precise event is controlled by the orientation of lox DNA
sequences. In cis, the lox sequences direct the Cre recombinase to either
delete (lox sequences in direct orientation) or invert (lox sequences in
inverted orientation) DNA flanked by the sequences, while in trans the lox
sequences can direct a homologous recombination event resulting in the
insertion of a recombinant DNA.
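
The dependence of the in cis outcome on lox site orientation can be sketched
at the sequence level. The example below is a simplified illustration under
stated assumptions: DNA is modeled as a plain string, exactly two loxP sites
are present, and the exchange is treated as exact so that products are written
whole site to whole site. The function names find_sites and recombine_cis and
the flanking test sequences are hypothetical; the in trans (insertion) event
is not modeled.

```python
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(dna: str):
    """Return sorted (position, orientation) pairs for each loxP site.

    Orientation is +1 for the site as written and -1 for its reverse
    complement.
    """
    hits = []
    for motif, strand in ((LOXP, 1), (revcomp(LOXP), -1)):
        pos = dna.find(motif)
        while pos != -1:
            hits.append((pos, strand))
            pos = dna.find(motif, pos + 1)
    return sorted(hits)


def recombine_cis(dna: str):
    """Model recombination between two loxP sites on one linear molecule.

    Returns (event, main_product, excised_circle_or_None).
    """
    sites = find_sites(dna)
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two loxP sites")
    (p1, s1), (p2, s2) = sites
    n = len(LOXP)
    if s1 == s2:
        # Same orientation: the intervening segment is excised as a
        # circle; each product keeps a single lox site.
        main = dna[:p1 + n] + dna[p2 + n:]
        circle = dna[p1 + n:p2 + n]  # circular, shown linearized
        return "deletion", main, circle
    # Opposite orientation: the intervening segment is inverted in place.
    main = dna[:p1 + n] + revcomp(dna[p1 + n:p2]) + dna[p2:]
    return "inversion", main, None


direct = "GGGG" + LOXP + "GATTACA" + LOXP + "CCCC"
inverted = "GGGG" + LOXP + "GATTACA" + revcomp(LOXP) + "CCCC"
print(recombine_cis(direct)[0], recombine_cis(inverted)[0])
```

In this model the only input that changes between the two calls is the
orientation of the second site, which mirrors the statement above that the
orientation of the lox sequences controls whether the flanked DNA is deleted
or inverted.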

As used herein, a chromosome is a nucleic acid molecule, and
associated proteins, that is capable of replication and segregation within a
cell upon cell division. Typically, a chromosome contains a centromeric
region, replication origins, telomeric regions and a region of nucleic acid
between the centromeric and telomeric regions.

As used herein, a centromere is any nucleic acid sequence that
confers an ability to segregate to daughter cells through cell division. A
centromere may confer stable segregation of a nucleic acid sequence,
including an artificial chromosome containing the centromere, through
mitotic or meiotic divisions, including through both mitotic and meiotic
divisions. A particular centromere is not necessarily derived from the
same species in which it is introduced, but has the ability to promote
DNA segregation in cells of that species.

As used herein, euchromatin and heterochromatin have their
recognized meanings. Euchromatin refers to chromatin that stains
diffusely and that typically contains genes, and heterochromatin refers to
chromatin that remains unusually condensed and that has been thought to
be transcriptionally inactive. Highly repetitive DNA sequences (satellite
DNA) are usually located in regions of the heterochromatin surrounding
the centromere (pericentric or pericentromeric heterochromatin).
Constitutive heterochromatin refers to heterochromatin that contains the
highly repetitive DNA which is constitutively condensed and genetically
inactive.

As used herein, an acrocentric chromosome refers to a
chromosome with arms of unequal length.

As used herein, endogenous chromosomes refer to genomic
chromosomes as found in a cell prior to generation or introduction of an
artificial chromosome.

As used herein, artificial chromosomes are nucleic acid molecules,
typically DNA, that stably replicate and segregate alongside endogenous
chromosomes in cells and have the capacity to accommodate and express
heterologous genes contained therein. An artificial chromosome has the
capacity to act as a gene delivery vehicle by accommodating and
expressing foreign genes contained therein. A mammalian artificial
chromosome (MAC) refers to chromosomes that have an active mammalian
centromere(s). Plant artificial chromosomes, insect artificial chromosomes
and avian artificial chromosomes refer to chromosomes that include
centromeres that function in plant, insect and avian cells, respectively. A
human artificial chromosome (HAC) refers to chromosomes that include
centromeres that function in human cells. For exemplary artificial
chromosomes, see, e.g.,
U.S. Patent Nos. 6,025,155; 6,077,697; 5,288,625; 5,712,134;
5,695,967; 5,869,294; 5,891,691 and 5,721,118 and published
International PCT application Nos. WO 97/40183 and WO 98/08964.

Artificial chromosomes include those that are predominantly
heterochromatic (formerly referred to as satellite artificial chromosomes
(SATACs); see, e.g., U.S. Patent Nos. 6,077,697 and 6,025,155 and
published International PCT application No. WO 97/40183),
minichromosomes that contain a de novo centromere (see U.S. Patent
Nos. 5,712,134, 5,891,691 and 5,288,625), artificial chromosomes
predominantly made up of repeating nucleic acid units and that contain
substantially equivalent amounts of euchromatic and heterochromatic
DNA, and in vitro assembled artificial chromosomes (see copending U.S.
provisional application Serial No. 60/294,687, filed on May 30, 2001).

As used herein, the term "satellite DNA-based artificial
chromosome (SATAC)" is interchangeable with the term "artificial
chromosome expression system (ACes)". These artificial chromosomes
(ACes) include those that are substantially all neutral non-coding
sequences (heterochromatin) except for foreign heterologous, typically
gene-encoding, nucleic acid that is interspersed within the
heterochromatin for the expression therein (see U.S. Patent Nos.
6,025,155 and 6,077,697 and International PCT application No. WO
97/40183), or that is in a single locus as provided herein. Also included
are ACes that may include euchromatin and that result from the process
described in U.S. Patent Nos. 6,025,155 and 6,077,697 and International
PCT application No. WO 97/40183 and outlined herein. The delineating
structural feature is the presence of repeating units that are generally
predominantly heterochromatin. The precise structure of the ACes will
depend upon the structure of the chromosome in which the initial
amplification event occurs; all share the common feature of including a
defined pattern of repeating units. Generally ACes have more
heterochromatin than euchromatin. Foreign nucleic acid molecules
(heterologous genes) contained in these artificial chromosome expression
systems can include any nucleic acid whose expression is of interest in a
particular host cell. Such foreign nucleic acid molecules include, but are
not limited to, nucleic acid that encodes traceable marker proteins
(reporter genes), such as fluorescent proteins, such as green, blue or red
fluorescent proteins (GFP, BFP and RFP, respectively), other reporter
genes, such as β-galactosidase, and proteins that confer drug resistance,
such as a gene encoding hygromycin-resistance. Other examples of
heterologous nucleic acid molecules include, but are not limited to, DNA
that encodes therapeutically effective substances, such as anti-cancer
agents, enzymes and hormones, DNA that encodes other types of
proteins, such as antibodies, and DNA that encodes RNA molecules (such
as antisense or siRNA molecules) that are not translated into proteins.

As used herein, an artificial chromosome platform, also referred to
herein as a "platform ACes" or "ACes platform", refers to an artificial
chromosome that has been engineered to include one or more sites for
site-specific, recombination-directed integration. In particular, ACes that
are so-engineered are provided. Any sites, including but not limited to
any described herein, that are suitable for such integration are
contemplated. Plant and animal platform ACes are provided. Among the
ACes contemplated herein are those that are predominantly
heterochromatic (formerly referred to as satellite artificial chromosomes
(SATACs); see, e.g., U.S. Patent Nos. 6,077,697 and 6,025,155 and
published International PCT application No. WO 97/40183), and artificial
chromosomes predominantly made up of repeating nucleic acid units and
that contain substantially equivalent amounts of euchromatic and
heterochromatic DNA resulting from an amplification event depicted in the
referenced patent and herein. Included among the ACes for use in
generating platforms are artificial chromosomes that introduce and
express heterologous nucleic acids in plants (see copending U.S.
provisional application Serial No. 60/294,687, filed on May 30, 2001).
These include artificial chromosomes that have a centromere derived from
a plant, and, also, artificial chromosomes that have centromeres that may
be derived from other organisms but that function in plants.




As used herein, a "reporter ACes" refers to an ACes that comprises
one or a plurality of reporter constructs, where the reporter construct
comprises a reporter gene in operative linkage with a regulatory region
responsive to test or known compounds.

As used herein, amplification, with reference to DNA, is a process
in which segments of DNA are duplicated to yield two or multiple copies
of substantially similar or identical or nearly identical DNA segments that
are typically joined as substantially tandem or successive repeats or
inverted repeats.

As used herein, amplification-based artificial chromosomes are
artificial chromosomes derived from natural or endogenous chromosomes
by virtue of an amplification event, such as one initiated by introduction
of heterologous nucleic acid into rDNA in a chromosome. As a result of
such an event, chromosomes and fragments thereof exhibiting segmented
or repeating patterns arise. Artificial chromosomes can be formed from
these chromosomes and fragments. Hence, amplification-based artificial
chromosomes refer to engineered chromosomes that exhibit an ordered
segmentation that is not observed in naturally occurring chromosomes
and that distinguishes them from naturally occurring chromosomes. The
segmentation, which can be visualized using a variety of chromosome
analysis techniques known to those of skill in the art, correlates with the
structure of these artificial chromosomes. In addition to containing one or
more centromeres, the amplification-based artificial chromosomes,
throughout the region or regions of segmentation, are predominantly made
up of nucleic acid units, also referred to as "amplicons", that is (are)
repeated in the region and that have a similar gross structure. Repeats of
an amplicon tend to be of similar size and share some common nucleic
acid sequences. For example, each repeat of an amplicon may contain a
replication site involved in amplification of chromosome segments and/or
some heterologous nucleic acid that was utilized in the initial production
of the artificial chromosome. Typically, the repeating units are
substantially similar in nucleic acid composition and may be nearly
identical.

The amplification-based artificial chromosomes differ depending on
the chromosomal region that has undergone amplification in the process
of artificial chromosome formation. The structures of the resulting
chromosomes can vary depending upon the initiating event and/or the
conditions under which the heterologous nucleic acid is introduced,
including modification to the endogenous chromosomes. For example, in
some of the artificial chromosomes provided herein, the region or regions
of segmentation may be made up predominantly of heterochromatic DNA.
In other artificial chromosomes provided herein, the region or regions of
segmentation may be made up predominantly of euchromatic DNA or may
be made up of similar amounts of heterochromatic and euchromatic DNA.

As used herein, an amplicon is a repeated nucleic acid unit. In
some of the artificial chromosomes described herein, an amplicon may
contain a set of inverted repeats of a megareplicon. A megareplicon
represents a higher order replication unit. For example, with reference to
some of the predominantly heterochromatic artificial chromosomes, the
megareplicon can contain a set of tandem DNA blocks (e.g., ~7.5 Mb
DNA blocks) each containing satellite DNA flanked by non-satellite DNA
or may be made up of substantially rDNA. Contained within the
megareplicon is a primary replication site, referred to as the
megareplicator, which may be involved in organizing and facilitating
replication of the pericentric heterochromatin and possibly the
centromeres. Within the megareplicon there may be smaller (e.g., 50-300
kb) secondary replicons.

In artificial chromosomes, such as those provided in U.S. Patent Nos.
6,025,155 and 6,077,697 and International PCT application No. WO
97/40183, the megareplicon is defined by two tandem blocks (~7.5 Mb
DNA blocks in the chromosomes provided therein). Within each artificial
chromosome or among a population thereof, each amplicon has the same
gross structure but may contain sequence variations. Such variations will
arise as a result of movement of mobile genetic elements, deletions or
insertions or mutations that arise, particularly in culture. Such variation
does not affect the use of the artificial chromosomes or their overall
structure as described herein.

As used herein, amplifiable, when used in reference to a
chromosome, particularly the method of generating artificial chromosomes
provided herein, refers to a region of a chromosome that is prone to
amplification. Amplification typically occurs during replication and other
cellular events involving recombination (e.g., DNA repair). Such regions
include regions of the chromosome that contain tandem repeats, such as
satellite DNA, rDNA, and other such sequences.

As used herein, a dicentric chromosome is a chromosome that
contains two centromeres. A multicentric chromosome contains more
than two centromeres.

As used herein, a formerly dicentric chromosome is a chromosome
that is produced when a dicentric chromosome fragments and acquires
new telomeres so that two chromosomes, each having one of the
centromeres, are produced. Each of the fragments is a replicable
chromosome. If one of the chromosomes undergoes amplification of
primarily euchromatic DNA to produce a fully functional chromosome that
is predominantly (at least more than 50%) euchromatin, it is a
minichromosome. The remaining chromosome is a formerly dicentric
chromosome. If one of the chromosomes undergoes amplification,
whereby heterochromatin (such as, for example, satellite DNA) is
amplified and a euchromatic portion (such as, for example, an arm)
remains, it is referred to as a sausage chromosome. A chromosome that
is substantially all heterochromatin, except for portions of heterologous
DNA, is called a predominantly heterochromatic artificial chromosome.
Predominantly heterochromatic artificial chromosomes can be produced
from other partially heterochromatic artificial chromosomes by culturing
the cell containing such chromosomes under conditions, such as BrdU
treatment, that destabilize the chromosome and/or growth under selective
conditions so that a predominantly heterochromatic artificial chromosome
is produced. For purposes herein, it is understood that the artificial
chromosomes may not necessarily be produced in multiple steps, but may
appear after the initial introduction of the heterologous DNA. Typically,
artificial chromosomes appear after about 5 to about 60, or about 5 to
about 55, or about 10 to about 55 or about 25 to about 55 or about 35
to about 55 cell doublings after initiation of artificial chromosome
generation, or they may appear after several cycles of growth under
selective conditions and BrdU treatment.

As used herein, an artificial chromosome that is predominantly 

20 heterochromatic (i.e., containing more heterochromatin than euchromatin, 
typically more than about 50%, more than about 70%, or more than 
about 90% heterochromatin) may be produced by introducing nucleic acid 
molecules into cells, such as, for example, animal or plant cells, and 
selecting cells that contain a predominantly heterochromatic artificial 

25 chromosome. Any nucleic acid may be introduced into cells in such 
methods of producing the artificial chromosomes. For example, the 
nucleic acid may contain a selectable marker and/or optionally a sequence 
that targets nucleic acid to the pericentric, heterochromatic region of a 
chromosome, such as in the short arm of acrocentric chromosomes and
nucleolar organizing regions. Targeting sequences include, but are not
limited to, lambda phage DNA and rDNA for production of predominantly 
heterochromatic artificial chromosomes in eukaryotic cells. 

After introducing the nucleic acid into cells, a cell containing a
predominantly heterochromatic artificial chromosome is selected. Such
cells may be identified using a variety of procedures. For example, 
repeating units of heterochromatic DNA of these chromosomes may be 
discerned by G-banding and/or fluorescence in situ hybridization (FISH) 
techniques. Prior to such analyses, the cells to be analyzed may be
enriched with artificial chromosome-containing cells by sorting the cells
on the basis of the presence of a selectable marker, such as a reporter 
protein, or by growing (culturing) the cells under selective conditions. It 
is also possible, after introduction of nucleic acids into cells, to select 
cells that have a multicentric, typically dicentric, chromosome, a formerly
multicentric (typically dicentric) chromosome and/or various
heterochromatic structures, such as a megachromosome and a sausage
chromosome, that contain a centromere and are predominantly
heterochromatic and to treat them such that desired artificial
chromosomes are produced. Cells containing a new chromosome are
selected. Conditions for generation of a desired structure include, but are
not limited to, further growth under selective conditions, introduction of 
additional nucleic acid molecules and/or growth under selective conditions 
and treatment with destabilizing agents, and other such methods (see 
International PCT application No. WO 97/40183 and U.S. Patent Nos.
6,025,155 and 6,077,697).

As used herein, a "selectable marker" is a nucleic acid segment, 
generally DNA, that allows one to select for or against a molecule or a 
cell that contains it, often under particular conditions. These markers can 
encode an activity, such as, but not limited to, production of RNA,
peptide, or protein, or can provide a binding site for RNA, peptides,
proteins, inorganic and organic compounds and compositions. Examples
of selectable markers include but are not limited to: (1) nucleic acid
segments that encode products that provide resistance against otherwise
toxic compounds (e.g., antibiotics); (2) nucleic acid segments that encode
products that are otherwise lacking in the recipient cell (e.g., tRNA genes,
auxotrophic markers); (3) nucleic acid segments that encode products
that suppress the activity of a gene product; (4) nucleic acid segments
that encode products that can be identified, such as phenotypic markers,
including β-galactosidase, red, blue and/or green fluorescent proteins
(FPs), and cell surface proteins; (5) nucleic acid segments that bind
products that are otherwise detrimental to cell survival and/or function;
(6) nucleic acid segments that otherwise inhibit the activity of any of the
nucleic acid segments described in Nos. 1-5 above (e.g., antisense
oligonucleotides or siRNA molecules for use in RNA interference); (7)
nucleic acid segments that bind products that modify a substrate (e.g.,
restriction endonucleases); (8) nucleic acid segments that can be used to
isolate a desired molecule (e.g., specific protein binding sites); (9) nucleic
acid segments that encode a specific nucleotide sequence that can be
otherwise non-functional, such as for PCR amplification of subpopulations
of molecules; and/or (10) nucleic acid segments, which when absent,
directly or indirectly confer sensitivity to particular compounds. Thus, for
example, selectable markers include nucleic acids encoding fluorescent
proteins, such as green fluorescent proteins, β-galactosidase and other
readily detectable proteins, such as chromogenic proteins or proteins
capable of being bound by an antibody and FACS sorted. Selectable
markers such as these, which are not required for cell survival and/or
proliferation in the presence of a selection agent, are also referred to
herein as reporter molecules. Other selectable markers, e.g., the
neomycin phosphotransferase gene, provide for isolation and identification
of cells containing them by conferring properties on the cells that make 
them resistant to an agent, e.g., a drug such as an antibiotic, that inhibits 
proliferation of cells that do not contain the marker. 

As another example, interference of gene expression by
double-stranded RNA has been shown in Caenorhabditis elegans, plants,
Drosophila, protozoans and mammals. This method is known as RNA
interference (RNAi) and utilizes short, double-stranded RNA molecules
(siRNAs). The siRNAs are generally composed of a 19-22 bp
double-stranded RNA stem, a loop region and a 1-4 bp overhang on the 3' end.
The reduction of gene expression has been accomplished by direct
introduction of the siRNAs into the cell (Harborth J et al., 2001, J Cell Sci
114(Pt 24):4557-65) as well as the introduction of DNA encoding and
expressing the siRNA molecule. The encoded siRNA molecules are under
the regulation of an RNA polymerase III promoter (see, e.g., Yu et al.,
2002, Proc Natl Acad Sci USA 99(9):6047-52; Brummelkamp et al.,
2002, Science 296(5567):550-3; Miyagishi et al., 2002, Nat Biotechnol
20(5):497-500; and the like). In certain embodiments, RNAi in
mammalian cells may have advantages over other therapeutic methods.
For example, producing siRNA molecules that block viral genetic activities
in infected cells may reduce the effects of the virus. Platform ACes
provided herein encoding siRNA molecule(s) are an additional utilization of
the platform ACes technology. The platform ACes could be engineered to
encode one or more siRNA molecules to create gene "knockdowns". In
one embodiment, a platform ACes can be engineered to encode both the
siRNA molecule and a replacement gene. For example, a mouse model or
cell culture system could be generated using a platform ACes that has a
knockdown of the endogenous mouse gene, by siRNA, and the human
gene homolog expressing in place of the mouse gene. The placement of
siRNA encoding sequences under the regulation of a regulatable or
inducible promoter would allow one to temporally and/or spatially control 
the knockdown effect of the corresponding gene. 
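
The stem-loop layout described above (a 19-22 bp stem, a loop region, and a short 3' overhang) can be illustrated with a minimal sketch of how a DNA template encoding such a hairpin might be assembled. The loop sequence, the run of T residues used as a terminator, the example target sequence, and the function names are all assumptions chosen for illustration; they are not taken from this specification.

```python
# Illustrative sketch only. The loop, terminator, and target sequences are
# hypothetical placeholders, not sequences disclosed in this document.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def hairpin_template(target: str,
                     loop: str = "TTCAAGAGA",
                     terminator: str = "TTTTT") -> str:
    """Assemble sense stem + loop + antisense stem + terminator.

    The stem length is restricted to the 19-22 nucleotide range given in
    the text; the default loop and terminator are assumed values.
    """
    target = target.upper()
    if not 19 <= len(target) <= 22:
        raise ValueError("stem must be 19-22 nucleotides long")
    if set(target) - set(COMPLEMENT):
        raise ValueError("target may contain only A, C, G and T")
    return target + loop + reverse_complement(target) + terminator


example_target = "GATCCGTTAGCTAAGGCTA"  # hypothetical 19-nt target
example_template = hairpin_template(example_target)
print(example_template)
```

When transcribed, the two stem halves of such a template are complementary and can fold back on each other, with the loop joining them; placing the template downstream of a chosen promoter is outside the scope of this sketch.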

As used herein, a reporter gene includes any gene that expresses a
detectable gene product, which may be RNA or protein. Generally
reporter genes are readily detectable. Examples of reporter genes include,
but are not limited to nucleic acid encoding a fluorescent protein, CAT
(chloramphenicol acetyl transferase) (Alton et al. (1979) Nature
282:864-869), luciferase, and other enzyme detection systems, such as
beta-galactosidase; firefly luciferase (deWet et al. (1987) Mol. Cell. Biol.
7:725-737); bacterial luciferase (Engebrecht and Silverman (1984) Proc.
Natl. Acad. Sci. U.S.A. 81:4154-4158; Baldwin et al. (1984)
Biochemistry 23:3663-3667); and alkaline phosphatase (Toh et al. (1989)
Eur. J. Biochem. 182:231-238, Hall et al. (1983) J. Mol. Appl. Gen.
2:101).

As used herein, growth under selective conditions means growth of 
a cell under conditions that require expression of a selectable marker for 
survival. 

As used herein, an agent that destabilizes a chromosome is any
agent known by those skilled in the art to enhance amplification events,
and/or mutations. Such agents, which include BrdU, are well known to 
those skilled in the art. 

In order to generate an artificial chromosome containing a particular 
heterologous nucleic acid of interest, it is possible to include the nucleic
acid in the nucleic acid that is being introduced into cells to initiate
production of the artificial chromosome. Thus, for example, a nucleic
acid can be introduced into a cell along with nucleic acid encoding a
selectable marker and/or a nucleic acid that targets to a heterochromatic
region of a chromosome. For introducing a heterologous nucleic acid into
the cell, it can be included in a fragment that includes a selectable marker
or as part of a separate nucleic acid fragment and introduced into the cell
with a selectable marker during the process of generating the artificial
chromosomes. Alternatively, heterologous nucleic acid can be introduced
into an artificial chromosome at a later time after the initial generation of
the artificial chromosome. 

As used herein, the minichromosome refers to a chromosome 
derived from a multicentric, typically dicentric, chromosome that contains 
more euchromatic than heterochromatic DNA. For purposes herein, the
minichromosome contains a de novo centromere (e.g., a neocentromere).
In some embodiments, for example, the minichromosome contains a 
centromere that replicates in animals, e.g., a mammalian centromere or in 
plants, e.g., a plant centromere. 

As used herein, in vitro assembled artificial chromosomes or
synthetic chromosomes can be either more euchromatic than
heterochromatic or more heterochromatic than euchromatic and are
produced by joining essential components of a chromosome in vitro. 
These components include at least a centromere, a megareplicator, a 
telomere and optionally secondary origins of replication. 

As used herein, in vitro assembled plant or animal artificial
chromosomes are produced by joining essential components (at least the
centromere, telomere(s), megareplicator and optional secondary origins of
replication) that function in plants or animals. In particular embodiments,
the megareplicator contains sequences of rDNA, particularly plant or
animal rDNA.

As used herein, a plant is a eukaryotic organism that contains, in 
addition to a nucleus and mitochondria, chloroplasts capable of carrying 
out photosynthesis. A plant can be unicellular or multicellular and can 
contain multiple tissues and/or organs. Plants can reproduce sexually or
asexually and can be perennial or annual in growth. Plants can also be
terrestrial or aquatic. The term "plant" includes a whole plant, plant cell, 
plant protoplast, plant calli, plant seed, plant organ, plant tissue, and 
other parts of a whole plant. 

As used herein, stable maintenance of chromosomes occurs when
at least about 85%, preferably 90%, more preferably 95%, of the cells
retain the chromosome. Stability is measured in the presence of a
selective agent. Preferably these chromosomes are also maintained in the
absence of a selective agent. Stable chromosomes also retain their
structure during cell culturing, suffering no unintended intrachromosomal
or interchromosomal rearrangements.
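
As a simple illustration of the retention criterion above, the following sketch computes the fraction of scored cells that retain the chromosome and compares it with a threshold. The function names and the example cell counts are hypothetical; the default threshold of 0.85 reflects the "at least about 85%" figure, and the stricter preferred values can be passed explicitly. The sketch addresses only retention, not the structural criterion.

```python
def retention_fraction(cells_with_chromosome: int, cells_scored: int) -> float:
    """Fraction of scored cells that retain the chromosome."""
    if cells_scored <= 0:
        raise ValueError("at least one cell must be scored")
    if not 0 <= cells_with_chromosome <= cells_scored:
        raise ValueError("retaining cells must be between 0 and cells scored")
    return cells_with_chromosome / cells_scored


def is_stably_maintained(cells_with_chromosome: int,
                         cells_scored: int,
                         threshold: float = 0.85) -> bool:
    """True when the retention fraction meets the chosen threshold."""
    return retention_fraction(cells_with_chromosome, cells_scored) >= threshold


# Hypothetical counts: 92 of 100 scored cells retain the chromosome.
print(is_stably_maintained(92, 100))                   # default 85% criterion
print(is_stably_maintained(92, 100, threshold=0.95))   # stricter 95% criterion
```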

As used herein, de novo with reference to a centromere, refers to 
generation of an excess centromere in a chromosome as a result of 
incorporation of a heterologous nucleic acid fragment using the methods
herein.

As used herein, BrdU refers to 5-bromodeoxyuridine, which during 
replication is inserted in place of thymidine. BrdU is used as a mutagen; it 
also inhibits condensation of metaphase chromosomes during cell 
division. 

As used herein, ribosomal RNA (rRNA) is the specialized RNA that
forms part of the structure of a ribosome and participates in the synthesis
of proteins. Ribosomal RNA is produced by transcription of genes which,
in eukaryotic cells, are present in multiple copies. In human cells, the
approximately 250 copies of rRNA genes (i.e., genes which encode rRNA)
per haploid genome are spread out in clusters on at least five different
chromosomes (chromosomes 13, 14, 15, 21 and 22). In mouse cells, the
presence of ribosomal DNA (rDNA, which is DNA containing sequences
that encode rRNA) has been verified on at least 11 pairs out of 20 mouse
chromosomes (chromosomes 5, 6, 7, 9, 11, 12, 15, 16, 17, 18, and 19)
(see, e.g., Rowe et al. (1996) Mamm. Genome 7:886-889 and Johnson et
al. (1993) Mamm. Genome 4:49-52). In Arabidopsis thaliana the
presence of rDNA has been verified on chromosomes 2 and 4 (18S, 5.8S,
and 25S rDNA) and on chromosomes 3, 4, and 5 (5S rDNA) (see The
Arabidopsis Genome Initiative (2000) Nature 408:796-815). In
eukaryotic cells, the multiple copies of the highly conserved rRNA genes
are located in a tandemly arranged series of rDNA units, which are
generally about 40-45 kb in length and contain a transcribed region and a
nontranscribed region known as spacer (i.e., intergenic spacer) DNA
which can vary in length and sequence. In the human and mouse, these
tandem arrays of rDNA units are located adjacent to the pericentric
satellite DNA sequences (heterochromatin). The regions of these
chromosomes in which the rDNA is located are referred to as nucleolar
organizing regions (NOR) which loop into the nucleolus, the site of
ribosome production within the cell nucleus.

As used herein, a megachromosome refers to a chromosome that, 
except for introduced heterologous DNA, is substantially composed of 
heterochromatin. Megachromosomes are made up of an array of repeated 
amplicons that contain two inverted megareplicons bordered by
introduced heterologous DNA (see, e.g., Figure 3 of U.S. Patent No.
6,077,697 for a schematic drawing of a megachromosome). For
purposes herein, a megachromosome is about 50 to 400 Mb, generally
about 250-400 Mb. Shorter variants are also referred to as truncated
megachromosomes (about 90 to 120 or 150 Mb), dwarf
megachromosomes (~150-200 Mb), and a micro-megachromosome
(~50-90 Mb, typically 50-60 Mb). For purposes herein, the term
megachromosome refers to the overall repeated structure based on an
array of repeated chromosomal segments (amplicons) that contain two 
inverted megareplicons bordered by any inserted heterologous DNA. The 
size will be specified. 
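
The approximate size ranges given above can be summarized in a small lookup, sketched below. The table and function names are hypothetical, and the handling of boundary values (the lower-sized class wins at a shared boundary) and of sizes between about 200 and 250 Mb, for which the text names no class, are assumptions made only so that the example is well defined.

```python
# Approximate ranges in megabases (Mb) as stated in the text; boundary
# handling is an assumption of this sketch, not part of the definition.
SIZE_CLASSES = [
    (50.0, 90.0, "micro-megachromosome"),
    (90.0, 150.0, "truncated megachromosome"),
    (150.0, 200.0, "dwarf megachromosome"),
    (250.0, 400.0, "megachromosome"),
]


def size_class(size_mb: float) -> str:
    """Return the named size class for a size in Mb, if the text names one."""
    if not 50.0 <= size_mb <= 400.0:
        return "outside the about 50-400 Mb range"
    for low, high, label in SIZE_CLASSES:
        if low <= size_mb <= high:
            return label  # first match wins at shared boundaries
    return "megachromosome (no named size class)"


for size in (60, 100, 175, 225, 300):
    print(size, "Mb:", size_class(size))
```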

As used herein, gene therapy involves the transfer or insertion of
nucleic acid molecules into certain cells, which are also referred to as
target cells, to produce specific products that are involved in preventing,
curing, correcting, controlling or modulating diseases, disorders and
deleterious conditions. The nucleic acid is introduced into the selected
target cells in a manner such that the nucleic acid is expressed and a
product encoded thereby is produced. Alternatively, the nucleic acid may
in some manner mediate expression of DNA that encodes a therapeutic
product. This product may be a therapeutic compound, which is
produced in therapeutically effective amounts or at a therapeutically
useful time. It may also encode a product, such as a peptide or RNA,
that in some manner mediates, directly or indirectly, expression of a
therapeutic product. Expression of the nucleic acid by the target cells
within an organism afflicted with a disease or disorder thereby provides
for modulation of the disease or disorder. The nucleic acid encoding the
therapeutic product may be modified prior to introduction into the cells of
the afflicted host in order to enhance or otherwise alter the product or 
expression thereof. 

For use in gene therapy, cells can be transfected in vitro, followed 
by introduction of the transfected cells into an organism. This is often
referred to as ex vivo gene therapy. Alternatively, the cells can be
transfected directly in vivo within an organism. 

As used herein, therapeutic agents include, but are not limited to, 
growth factors, antibodies, cytokines, such as tumor necrosis factors and 
interleukins, and cytotoxic agents and other agents disclosed herein and
known to those of skill in the art. Such agents include, but are not
limited to, tumor necrosis factor, α-interferon, β-interferon, nerve growth
factor, platelet derived growth factor, tissue plasminogen activator; or,
biological response modifiers such as, for example, lymphokines,
interleukin-1 (IL-1), interleukin-2 (IL-2), interleukin-6 (IL-6), granulocyte
macrophage colony stimulating factor (GM-CSF), granulocyte colony
stimulating factor (G-CSF), erythropoietin (EPO), pro-coagulants such as
tissue factor and tissue factor variants, pro-apoptotic agents such as
FAS-ligand, fibroblast growth factors (FGF), nerve growth factor and other
growth factors.

As used herein, a therapeutically effective product is a product that
is encoded by heterologous DNA and that, upon introduction of the DNA
into a host, is expressed and effectively ameliorates or eliminates
the symptoms or manifestations of an inherited or acquired disease, or
cures the disease.

As used herein, transgenic plants and animals refer to plants and 
animals in which heterologous or foreign nucleic acid is expressed or in 
which the expression of a gene naturally present in the plant or animal 
has been altered by virtue of introduction of heterologous or foreign 
nucleic acid. 

As used herein, IRES (internal ribosome entry site; see, e.g., SEQ
ID No. 27 and nucleotides 2736-3308 of SEQ ID No. 28) refers to a region
of a nucleic acid molecule, such as an mRNA molecule, that allows
internal ribosome entry sufficient to initiate translation, which initiation
can be detected in an assay for cap-independent translation (see, e.g.,
U.S. Patent No. 6,171,821). The presence of an IRES within an mRNA
molecule allows cap-independent translation of a linked protein-encoding 
sequence that otherwise would not be translated.

Internal ribosome entry site (IRES) elements were first identified in
picornaviruses, which elements are considered the paradigm for
cap-independent translation. The 5' UTRs of all picornaviruses are long and
mediate translational initiation by directly recruiting and binding
ribosomes, thereby circumventing the initial cap-binding step. IRES
elements are frequently found in viral mRNA but are rare in non-viral
mRNA. Among non-viral mRNA molecules that contain functional IRES
elements in their respective 5' UTRs are those encoding immunoglobulin
heavy chain binding protein (BiP) (Macejak et al. (1991) Nature
353:90-94); Drosophila Antennapedia (Oh et al. (1992) Genes Dev.
6:1643-1653); D. Ultrabithorax (Ye et al. (1997) Mol. Cell Biol.
17:1714-21); fibroblast growth factor 2 (Vagner et al. (1995) Mol. Cell
Biol. 15:35-44); initiation factor eIF4G (Gan et al. (1998) J. Biol. Chem.
273:5006-5012); proto-oncogene c-myc (Nanbru et al. (1995) J. Biol.
Chem. 272:32061-32066; Stoneley (1998) Oncogene 16:423-428);
IRES H, from the 5'UTR of NRF1 gene (Oumard et al. (2000) Mol. and Cell
Biol. 20(8):2755-2759); and vascular endothelial growth factor (VEGF)
(Stein et al. (1998) Mol. Cell Biol. 18:3112-9).

As used herein, a promoter, with respect to a region of DNA, refers
to a sequence of DNA that contains a sequence of bases that signals RNA
polymerase to associate with the DNA and initiate transcription of RNA
(such as pol II for mRNA) from a template strand of the DNA. A promoter
thus generally regulates transcription of DNA into mRNA. A particular
promoter provided herein is the Ferritin heavy chain promoter (excluding
the Iron Response Element, located in the 5'UTR), which was joined to
the 37bp Fer-1 enhancer element. This promoter is set forth as SEQ ID 
NO: 128. The endogenous Fer-1 enhancer element is located upstream of 
the Fer-1 promoter (e.g., a Fer-1 oligo was cloned proximal to the core 
promoter).

As used herein, isolated, substantially pure nucleic acid, such as,
for example, DNA, refers to nucleic acid fragments purified according to
standard techniques employed by those skilled in the art, such as that
found in Sambrook et al. ((2001) Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY,
3rd edition).

As used herein, expression refers to the transcription and/or
translation of nucleic acid. For example, expression can be the
transcription of a gene that may be transcribed into an RNA molecule,
such as a messenger RNA (mRNA) molecule. Expression may further
include translation of an RNA molecule into peptides,
polypeptides, or proteins. If the nucleic acid is derived from genomic
DNA, expression may, if an appropriate eukaryotic host cell or organism is
selected, include splicing of the mRNA. With respect to an antisense
construct, expression may refer to the transcription of the antisense DNA.

As used herein, vector or plasmid refers to discrete elements that 
are used to introduce heterologous nucleic acids into cells for either 
expression of the heterologous nucleic acid or for replication of the 
heterologous nucleic acid. Selection and use of such vectors and
plasmids are well within the level of skill of the art.

As used herein, transformation/transfection refers to the process by
which nucleic acid is introduced into cells. The terms transfection and
transformation refer to the taking up of exogenous nucleic acid, e.g., an
expression vector, by a host cell whether or not any coding sequences
are in fact expressed. Numerous methods of transfection are known to
the ordinarily skilled artisan, for example, Agrobacterium-mediated
transformation, protoplast transformation (including polyethylene glycol
(PEG)-mediated transformation, electroporation, protoplast fusion, and
microcell fusion), lipid-mediated delivery, liposomes, electroporation,
sonoporation, microinjection, particle bombardment and silicon carbide
whisker-mediated transformation and combinations thereof (see, e.g.,
Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985)
Mol. Gen. Genet. 199:169-177; Reich et al. (1986) Biotechnology
4:1001-1004; Klein et al. (1987) Nature 327:70-73; U.S. Patent No.
6,143,949; Paszkowski et al. (1989) in Cell Culture and Somatic Cell
Genetics of Plants, Vol. 6, Molecular Biology of Plant Nuclear Genes, eds.
Schell, J and Vasil, L.K. Academic Publishers, San Diego, California, p.
52-68; and Frame et al. (1994) Plant J. 5:941-948), direct uptake using
calcium phosphate (CaPO4; see, e.g., Wigler et al. (1979) Proc. Natl.
Acad. Sci. U.S.A. 76:1373-1376), polyethylene glycol (PEG)-mediated
DNA uptake, lipofection (see, e.g., Strauss (1996) Meth. Mol. Biol.
54:307-327), microcell fusion (see, EXAMPLES, see, also Lambert (1991)
Proc. Natl. Acad. Sci. U.S.A. 88:5907-5911; U.S. Patent No. 5,396,767,
Sawford et al. (1987) Somatic Cell Mol. Genet. 13:279-284; Dhar et al.
(1984) Somatic Cell Mol. Genet. 10:547-559; and McNeill-Killary et al.
(1995) Meth. Enzymol. 254:133-152), lipid-mediated carrier systems
(see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al.
(1996) Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev.
Biol. Anim. 31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654;
Le Bolch et al. (1995) Tetrahedron Lett. 35:6681-6684; Loeffler et
al. (1993) Meth. Enzymol. 217:599-618) or other suitable method.
Methods for delivery of ACes are described in copending U.S. application
Serial No. 09/815,979. Successful transfection is generally recognized
by detection of the presence of the heterologous nucleic acid within the
transfected cell, such as, for example, any visualization of the
heterologous nucleic acid or any indication of the operation of a vector
within the host cell.

As used herein, "delivery," which is used interchangeably with
"transfection," refers to the process by which exogenous nucleic acid
molecules are transferred into a cell such that they are located inside the
cell. Delivery of nucleic acids is a distinct process from expression of
nucleic acids.

As used herein, injected refers to the microinjection, such as by 
use of a small syringe, needle, or pipette, for injection of nucleic acid into 
a cell. 

As used herein, substantially homologous DNA refers to DNA that
includes a sequence of nucleotides that is sufficiently similar to another
such sequence to form stable hybrids, with each other or a reference 
sequence, under specified conditions. 

It is well known to those of skill in this art that nucleic acid
fragments with different sequences may, under the same conditions,
hybridize detectably to the same "target" nucleic acid. Two nucleic acid
fragments hybridize detectably, under stringent conditions over a
sufficiently long hybridization period, because one fragment contains a
segment of at least about 10, 14 or 16 or more nucleotides in a sequence
that is complementary (or nearly complementary) to a substantially
contiguous sequence of at least one segment in the other nucleic acid
fragment. If the time during which hybridization is allowed to occur is
held constant, at a value during which, under preselected stringency
conditions, two nucleic acid fragments with complementary base-pairing
segments hybridize detectably to each other, departures from exact
complementarity can be introduced into the base-pairing segments, and
base-pairing will nonetheless occur to an extent sufficient to make
hybridization detectable. As the departure from complementarity between
the base-pairing segments of two nucleic acids becomes larger, and as
conditions of the hybridization become more stringent, the probability
decreases that the two segments will hybridize detectably to each other. 

Two single-stranded nucleic acid segments have "substantially the
same sequence" if (a) both form a base-paired duplex with the same
segment, and (b) the melting temperatures of the two duplexes in a
solution of 0.5 X SSPE differ by less than 10°C. If the segments being
compared have the same number of bases, then to have "substantially
the same sequence", they will typically differ in their sequences at fewer
than 1 base in 10. Methods for determining melting temperatures of
nucleic acid duplexes are well known (see, e.g., Meinkoth et al. (1984)
Anal. Biochem. 138:267-284 and references cited therein).
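
The two numerical conditions in this definition can be sketched as simple checks. The function names and example sequences are hypothetical, the melting temperatures are assumed to have been measured separately (for example, in 0.5 X SSPE), and the mismatch check applies only to segments of equal length, as the text states. This is an illustration of the arithmetic, not a method for determining melting temperatures.

```python
def tm_criterion_met(tm_first_duplex_c: float, tm_second_duplex_c: float) -> bool:
    """Condition (b): the two duplex melting temperatures differ by < 10 deg C."""
    return abs(tm_first_duplex_c - tm_second_duplex_c) < 10.0


def count_mismatches(segment_a: str, segment_b: str) -> int:
    """Count positions at which two equal-length segments differ."""
    if len(segment_a) != len(segment_b):
        raise ValueError("segments must have the same number of bases")
    return sum(1 for a, b in zip(segment_a.upper(), segment_b.upper()) if a != b)


def differs_at_fewer_than_one_in_ten(segment_a: str, segment_b: str) -> bool:
    """Typical property stated in the text: fewer than 1 mismatch per 10 bases."""
    # Integer comparison avoids floating-point rounding at the boundary.
    return count_mismatches(segment_a, segment_b) * 10 < len(segment_a)


# Hypothetical 20-base segments differing at one position.
reference = "ACGTACGTACGTACGTACGT"
variant = "ACGTACGTACCTACGTACGT"
print(count_mismatches(reference, variant))
print(differs_at_fewer_than_one_in_ten(reference, variant))
print(tm_criterion_met(62.0, 58.5))  # assumed, separately measured values
```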

As used herein, a nucleic acid probe is a DNA or RNA fragment 
that includes a sufficient number of nucleotides to specifically hybridize to 
DNA or RNA that includes complementary or substantially complementary
sequences of nucleotides. A probe may contain any number of
nucleotides, from as few as about 10 to as many as hundreds of
thousands of nucleotides. The conditions and protocols for such
hybridization reactions are well known to those of skill in the art as are
the effects of probe size, temperature, degree of mismatch, salt
concentration and other parameters on the hybridization reaction. For
example, the lower the temperature and higher the salt concentration at 
which the hybridization reaction is carried out, the greater the degree of 
mismatch that may be present in the hybrid molecules. 

To be used as a hybridization probe, the nucleic acid is generally
rendered detectable by labeling it with a detectable moiety or label, such
as 32P, 3H and 14C, or by other means, including chemical labeling, such
as by nick-translation in the presence of deoxyuridylate biotinylated at the
5'-position of the uracil moiety. The resulting probe includes the
biotinylated uridylate in place of thymidylate residues and can be detected
(via the biotin moieties) by any of a number of commercially available
detection systems based on binding of streptavidin to the biotin. Such
commercially available detection systems can be obtained, for example,
from Enzo Biochemicals, Inc. (New York, NY). Any other label known to
those of skill in the art, including non-radioactive labels, may be used as
long as it renders the probes sufficiently detectable, which is a function of
the sensitivity of the assay, the time available (for culturing cells,
extracting DNA, and hybridization assays), the quantity of DNA or RNA
available as a source of the probe, the particular label and the means used
to detect the label.

Once sequences with a sufficiently high degree of homology to the
probe are identified, they can readily be isolated by standard techniques
(see, e.g., Sambrook et al. (2001) Molecular Cloning: A Laboratory
Manual, 3rd Edition, Cold Spring Harbor Laboratory Press).

As used herein, DNA molecules that form stable hybrids under the specified
conditions are considered substantially homologous, and a DNA or nucleic
acid homolog refers to a nucleic acid that includes a preselected
conserved nucleotide sequence, such as a sequence encoding a
polypeptide. By the term "substantially homologous" is meant having at
least 75%, preferably 80%, preferably at least 90%, most preferably at
least 95% homology therewith or a lesser percentage of homology or
identity and conserved biological activity or function.

The terms "homology" and "identity" are often used
interchangeably. In this regard, percent homology or identity may be
determined, for example, by comparing sequence information using a GAP
computer program. The GAP program utilizes the alignment method of
Needleman and Wunsch (J. Mol. Biol. 48:443 (1970)), as revised by Smith
and Waterman (Adv. Appl. Math. 2:482 (1981)). Briefly, the GAP
program defines similarity as the number of aligned symbols (i.e.,
nucleotides or amino acids) which are similar, divided by the total number
of symbols in the shorter of the two sequences. The preferred default
parameters for the GAP program may include: (1) a unary comparison
matrix (containing a value of 1 for identities and 0 for non-identities) and
the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids 
Res. 14:6745 (1986), as described by Schwartz and Dayhoff, eds., 
ATLAS OF PROTEIN SEQUENCE AND STRUCTURE, National Biomedical 
Research Foundation, pp. 353-358 (1979); (2) a penalty of 3.0 for each 
gap and an additional 0.10 penalty for each symbol in each gap; and (3) 
no penalty for end gaps. 
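
The similarity measure and gap penalty described above can be illustrated with a minimal sketch. It does not perform the alignment itself and is not the GAP program; it assumes the two sequences have already been aligned, with "-" marking gap positions, and scores them with the unary matrix (1 for identities, 0 for non-identities). The function names and the example alignment are hypothetical.

```python
def similarity_of_alignment(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical aligned symbols divided by the length of the shorter sequence.

    Both inputs must be rows of one pairwise alignment (equal length,
    gaps shown as `gap`). Sequence lengths exclude gap characters.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have equal length")
    identical = sum(1 for a, b in zip(aligned_a, aligned_b)
                    if a != gap and a == b)
    shorter = min(len(aligned_a) - aligned_a.count(gap),
                  len(aligned_b) - aligned_b.count(gap))
    if shorter == 0:
        raise ValueError("each sequence must contain at least one symbol")
    return identical / shorter


def gap_penalty(gap_length: int, per_gap: float = 3.0, per_symbol: float = 0.10) -> float:
    """Penalty of 3.0 per gap plus 0.10 for each symbol in the gap."""
    if gap_length < 1:
        raise ValueError("a gap contains at least one symbol")
    return per_gap + per_symbol * gap_length


# Hypothetical alignment: four identical columns, shorter sequence has 5 symbols.
row_a = "ACGT-A"
row_b = "ACCTGA"
print(similarity_of_alignment(row_a, row_b))
print(gap_penalty(5))
```

Under these assumptions the example alignment scores 4 identities over a shorter-sequence length of 5, and a five-symbol gap incurs a penalty of 3.0 + 5 x 0.10.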

By sequence identity, the number of conserved amino acids is
determined by standard alignment algorithm programs, used with
default gap penalties established by each supplier. Substantially
homologous nucleic acid molecules would hybridize typically at moderate 
stringency or at high stringency all along the length of the nucleic acid of 
interest. Preferably the two molecules will hybridize under conditions of 
high stringency. Also contemplated are nucleic acid molecules that 
contain degenerate codons in place of codons in the hybridizing nucleic 
acid molecule. 

Whether any two nucleic acid molecules have nucleotide sequences 
that are at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% 
"identical" can be determined using known computer algorithms such as 
the "FASTA" program, using for example, the default parameters as in
Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 (1988).
Alternatively the BLAST function of the National Center for Biotechnology 
Information database may be used to determine relative sequence 
identity. 

In general, sequences are aligned so that the highest order match
is obtained. "Identity" per se has an art-recognized meaning and can be
calculated using published techniques. (See, e.g.: Computational
Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York,
1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York, 1993; Computer Analysis of Sequence Data,
Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey,
1994; Sequence Analysis in Molecular Biology, von Heinje, G., Academic
Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux,
J., eds., M Stockton Press, New York, 1991). While there exist a number
of methods to measure identity between two polynucleotide or
polypeptide sequences, the term "identity" is well known to skilled
artisans (Carillo, H. & Lipton, D., SIAM J Applied Math 48:1073 (1988)).
Methods commonly employed to determine identity or similarity between
two sequences include, but are not limited to, those disclosed in Guide to
Huge Computers, Martin J. Bishop, ed., Academic Press, San Diego,
1994, and Carillo, H. & Lipton, D., SIAM J Applied Math 48:1073
(1988). Methods to determine identity and similarity are codified in
computer programs. Preferred computer program methods to determine
identity and similarity between two sequences include, but are not limited
to, GCG program package (Devereux, J., et al., Nucleic Acids Research
12(1):387 (1984)), BLASTP, BLASTN, FASTA (Altschul, S.F., et al., J
Molec Biol 215:403 (1990)).

Therefore, as used herein, the term "identity" represents a 
comparison between a test and a reference polypeptide or polynucleotide. 
For example, a test polypeptide may be defined as any polypeptide that is
90% or more identical to a reference polypeptide.

As used herein, the term "at least 90% identical to" refers to
percent identities from 90 to 99.99 relative to the reference polypeptides.
Identity at a level of 90% or more is indicative of the fact that, assuming
for exemplification purposes that a test and a reference polypeptide each
100 amino acids in length are compared, no more than 10% (i.e., 10 out of
100) of the amino acids in the test polypeptide differ from those of the
reference polypeptide. Similar comparisons may be made between test and
reference polynucleotides. Such differences may be represented as point
mutations randomly distributed over the entire length of an amino acid
sequence or they may be clustered in one or more locations of varying
length up to the maximum allowable, e.g. 10/100 amino acid difference
(approximately 90% identity). Differences are defined as nucleic acid or
amino acid substitutions, or deletions.
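
The 10-in-100 example above can be worked through in a short sketch. It assumes that the test sequence has already been aligned to the reference, that both strings have the same length, and that a deletion in the test sequence is written as "-"; the function names are illustrative only.

```python
def percent_identity(test, reference):
    """Percent of reference positions that are unchanged in the aligned test."""
    if len(test) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    # A substitution or a deletion ('-') at a position counts as one difference.
    differences = sum(1 for t, r in zip(test, reference) if t != r)
    return 100.0 * (len(reference) - differences) / len(reference)


def is_at_least_identical(test, reference, threshold=90.0):
    """True if the test sequence meets the stated percent-identity threshold."""
    return percent_identity(test, reference) >= threshold
```

For a 100-residue reference, a test sequence with ten differences, whether scattered or clustered, evaluates to 90% and meets the threshold; an eleventh difference brings it to 89% and fails.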

As used herein, stringency of hybridization in determining
percentage mismatch encompasses the following conditions or equivalent
conditions thereto:

1) high stringency: 0.1 x SSPE or SSC, 0.1% SDS, 65°C

2) medium stringency: 0.2 x SSPE or SSC, 0.1% SDS, 50°C

3) low stringency: 1.0 x SSPE or SSC, 0.1% SDS, 50°C

or any combination of salt and temperature and other reagents that result
in selection of the same degree of mismatch or matching. Equivalent
conditions refer to conditions that select for substantially the same
percentage of mismatch in the resulting hybrids. Additions of ingredients,
such as formamide, Ficoll, and Denhardt's solution affect parameters such
as the temperature under which the hybridization should be conducted
and the rate of the reaction. Thus, hybridization in 5 X SSC, in 20%
formamide at 42°C is substantially the same as the conditions recited
above for hybridization under conditions of low stringency. The recipes for
SSPE, SSC and Denhardt's and the preparation of deionized formamide
are described, for example, in Sambrook et al. (1989) Molecular Cloning,
A Laboratory Manual, Cold Spring Harbor Laboratory Press, Chapter 8;
see, Sambrook et al., vol. 3, p. B.13; see, also, numerous catalogs that
describe commonly used laboratory solutions. It is understood that
equivalent stringencies may be achieved using alternative buffers, salts
and temperatures. As used herein, all assays and procedures, such as
hybridization reactions and antibody-antigen reactions, unless otherwise
specified, are conducted under conditions recognized by those of skill in
the art as standard conditions.
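
For convenience, the three stringency levels defined above may be held in a small lookup table, for example when annotating hybridization records. The sketch below only restates the values given in the text; the class and field names are hypothetical, and it makes no attempt to compute the "equivalent conditions" that the text leaves to the skilled artisan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WashCondition:
    salt_x: float       # concentration of SSPE or SSC, in multiples of 1x
    sds_percent: float  # SDS concentration, percent
    temp_c: int         # temperature, degrees Celsius


# Values restated from the definition of stringency above.
STRINGENCY = {
    "high": WashCondition(salt_x=0.1, sds_percent=0.1, temp_c=65),
    "medium": WashCondition(salt_x=0.2, sds_percent=0.1, temp_c=50),
    "low": WashCondition(salt_x=1.0, sds_percent=0.1, temp_c=50),
}


def describe(level):
    """Render one stringency level in the notation used in the text."""
    c = STRINGENCY[level]
    return f"{c.salt_x:g} x SSPE or SSC, {c.sds_percent:g}% SDS, {c.temp_c}°C"
```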

As used herein, conservative amino acid substitutions, such as 
those set forth in Table 1, are those that do not eliminate biological
activity. Suitable conservative substitutions of amino acids are known to 
those of skill in this art and may be made generally without altering the 
biological activity of the resulting molecule. Those of skill in this art 
recognize that, in general, single amino acid substitutions in non-essential 
regions of a polypeptide do not substantially alter biological activity (see, 
e.g., Watson et al. Molecular Biology of the Gene, 4th Edition, 1987, The
Benjamin/Cummings Pub. Co., p. 224). Conservative amino acid
substitutions are made, for example, in accordance with those set forth in 
TABLE 1 as follows: 

TABLE 1

Original residue     Conservative substitution
Ala (A)              Gly; Ser; Abu
Arg (R)              Lys; Orn
Asn (N)              Gln; His
Cys (C)              Ser
Gln (Q)              Asn
Glu (E)              Asp
Gly (G)              Ala; Pro
His (H)              Asn; Gln
Ile (I)              Leu; Val; Met; Nle; Nva
Leu (L)              Ile; Val; Met; Nle; Nva
Lys (K)              Arg; Gln; Glu
Met (M)              Leu; Tyr; Ile; Nle; Val
Ornithine            Lys; Arg
Phe (F)              Met; Leu; Tyr
Ser (S)              Thr
Thr (T)              Ser
Trp (W)              Tyr
Tyr (Y)              Trp; Phe
Val (V)              Ile; Leu; Met; Nle; Nva

Other substitutions are also permissible and may be determined
empirically or in accord with known conservative substitutions.
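
As a sketch of how the substitutions of Table 1 might be consulted programmatically, the table can be encoded as a mapping from each original residue to its listed substitutions, using standard three-letter codes. The names below are illustrative. The lookup is directional, as the table is: Tyr is listed for Met, but Met is not listed for Tyr.

```python
# Table 1, keyed by original residue (three-letter codes; Orn = ornithine).
CONSERVATIVE = {
    "Ala": {"Gly", "Ser", "Abu"},
    "Arg": {"Lys", "Orn"},
    "Asn": {"Gln", "His"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Ala", "Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val", "Met", "Nle", "Nva"},
    "Leu": {"Ile", "Val", "Met", "Nle", "Nva"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Tyr", "Ile", "Nle", "Val"},
    "Orn": {"Lys", "Arg"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu", "Met", "Nle", "Nva"},
}


def is_conservative(original, replacement):
    """True if Table 1 lists `replacement` as a substitution for `original`."""
    return replacement in CONSERVATIVE.get(original, set())
```

A residue absent from the table, or a substitution determined empirically as the text permits, simply returns False here and would need to be added to the mapping.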

As used herein, the amino acids, which occur in the various amino
acid sequences appearing herein, are identified according to their
well-known, three-letter or one-letter abbreviations. The nucleotides, which
occur in the various DNA fragments, are designated with the standard 
single-letter designations used routinely in the art. 

As used herein, a splice variant refers to a variant produced by 
differential processing of a primary transcript of genomic DNA that results 
in more than one type of mRNA. 

As used herein, a probe or primer based on a nucleotide sequence 
includes at least 10, 14, 16, 30 or 100 contiguous nucleotides from the 
reference nucleic acid molecule. 
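
The definition of a probe or primer above can be illustrated with a simple check for a shared contiguous stretch. The sketch assumes plain-text sequences on the same strand and does not consider the reverse complement; the function name and the default minimum of 10 nucleotides are chosen only for illustration.

```python
def is_based_on(probe, reference, min_contiguous=10):
    """True if the probe contains at least `min_contiguous` contiguous
    nucleotides that also occur contiguously in the reference."""
    probe, reference = probe.upper(), reference.upper()
    windows = range(len(probe) - min_contiguous + 1)
    return any(probe[i:i + min_contiguous] in reference for i in windows)
```

A probe shorter than the minimum yields no windows and therefore returns False; raising `min_contiguous` to 14, 16, 30 or 100 applies the other thresholds recited above.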

As used herein, recombinant production by using recombinant DNA 
methods refers to the use of the well known methods of molecular 
biology for expressing proteins encoded by cloned DNA. 

As used herein, biological activity refers to the in vivo activities of 
a compound or physiological responses that result upon in vivo 
administration of a compound, composition or other mixture. Biological 
activity, thus, encompasses therapeutic effects and pharmaceutical 
activity of such compounds, compositions and mixtures. Biological 
activities may be observed in in vitro systems designed to test or use 
such activities. Thus, for purposes herein the biological activity of a 
luciferase is its oxygenase activity whereby, upon oxidation of a 
substrate, light is produced.

The term substantially identical or similar varies with the context
as understood by those skilled in the relevant art and generally means at
least 40, 60, 80, 90, 95 or 98%.

As used herein, substantially identical to a product means
sufficiently similar so that the property is sufficiently unchanged so that
the substantially identical product can be used in place of the product.

As used herein, substantially pure means sufficiently homogeneous
to appear free of readily detectable impurities as determined by standard
methods of analysis, such as thin layer chromatography (TLC), gel
electrophoresis and high performance liquid chromatography (HPLC), used
by those of skill in the art to assess such purity, or sufficiently pure such
that further purification would not detectably alter the physical and
chemical properties, such as enzymatic and biological activities, of the
substance. Methods for purification of the compounds to produce
substantially chemically pure compounds are known to those of skill in
the art. A substantially chemically pure compound may, however, be a
mixture of stereoisomers or isomers. In such instances, further
purification might increase the specific activity of the compound.

As used herein, vector (or plasmid) refers to discrete elements that
are used to introduce heterologous DNA into cells for either expression or
replication thereof. The vectors typically remain episomal, but may be
designed to effect integration of a gene or portion thereof into a
chromosome of the genome. Also contemplated are vectors that are
artificial chromosomes, such as yeast artificial chromosomes and
mammalian artificial chromosomes. Selection and use of such vehicles
are well known to those of skill in the art. An expression vector includes
vectors capable of expressing DNA that is operatively linked with
regulatory sequences, such as promoter regions, that are capable of
effecting expression of such DNA fragments. Thus, an expression vector
refers to a recombinant DNA or RNA construct, such as a plasmid, a
phage, recombinant virus or other vector that, upon introduction into an
appropriate host cell, results in expression of the cloned DNA.
Appropriate expression vectors are well known to those of skill in the art
and include those that are replicable in eukaryotic cells and/or prokaryotic
cells and those that remain episomal or those which integrate into the
host cell genome.

As used herein, protein-binding-sequence refers to a protein or
peptide sequence that is capable of specific binding to other protein or
peptide sequences generally, to a set of protein or peptide sequences or
to a particular protein or peptide sequence.

As used herein, a composition refers to any mixture of two or more
ingredients. It may be a solution, a suspension, liquid, powder, a paste,
aqueous, non-aqueous or any combination thereof.

As used herein, a combination refers to any association between
two or more items.

As used herein, fluid refers to any composition that can flow.
Fluids thus encompass compositions that are in the form of semi-solids,
pastes, solutions, aqueous mixtures, gels, lotions, creams and other such
compositions.

As used herein, a cellular extract refers to a preparation or fraction 
that is made from a lysed or disrupted cell. 

As used herein, the term "subject" refers to animals, plants,
insects, and birds and other phyla, genera and species into which nucleic
acid molecules may be introduced. Included are higher organisms, such
as mammals, fish, insects and birds, including humans, primates, cattle,
pigs, rabbits, goats, sheep, mice, rats, guinea pigs, hamsters, cats, dogs,
horses, chicken and others.

As used herein, flow cytometry refers to processes that use a
laser-based instrument capable of analyzing and sorting cells and/or
chromosomes based on size and fluorescence.

As used herein, the abbreviations for any protective groups, amino 
acids and other compounds, are, unless indicated otherwise, in accord 
with their common usage, recognized abbreviations, or the IUPAC-IUB 
Commission on Biochemical Nomenclature (see, (1972) Biochem.
11:942-944).

B. Recombination systems 

Site-specific recombination systems typically contain three 
elements: a pair of DNA sequences (the site-specific recombination 
sequences) and a specific enzyme (the site-specific recombinase). The 
site-specific recombinase catalyzes a recombination reaction between two 
site-specific recombination sequences. 

A number of different site-specific recombinase systems are
available and/or known to those of skill in the art, including, but not
limited to: the Cre/lox recombination system using CRE recombinase (see,
e.g., SEQ ID Nos. 58 and 59) from the Escherichia coli phage P1 (see,
e.g., Sauer (1993) Methods in Enzymology 225:890-900; Sauer et al.
(1990) The New Biologist 2:441-449; Sauer (1994) Current Opinion in
Biotechnology 5:521-527; Odell et al. (1990) Mol. Gen. Genet.
223:369-378; Lasko et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:6232-6236;
U.S. Patent No. 5,658,772), the FLP/FRT system of yeast using the FLP
recombinase (see, SEQ ID Nos. 60 and 61) from the 2μ episome of
Saccharomyces cerevisiae (Cox (1983) Proc. Natl. Acad. Sci. U.S.A.
80:4223; Falco et al. (1982) Cell 29:573-584; Golic et al. (1989)
Cell 59:499-509; U.S. Patent No. 5,744,336), the resolvases, including
Gin recombinase of phage Mu (Maeser et al. (1991) Mol. Gen. Genet.
230:170-176; Klippel, A. et al. (1993) EMBO J. 12:1047-1057; see, e.g.,
SEQ ID Nos. 64-67), Cin, Hin, γδ, Tn3; the Pin recombinase of E. coli
(see, e.g., SEQ ID Nos. 68 and 69; Enomoto et al. (1983) J. Bacteriol.
156:663-668), the R/RS system of the pSR1 plasmid of Zygosaccharomyces
rouxii (Araki et al. (1992) J. Mol. Biol. 225:25-37; Matsuzaki et al. (1990)
J. Bacteriol. 172:610-618) and site-specific recombinases from
Kluyveromyces drosophilarium (Chen et al. (1986) Nucleic Acids Res.
14:4471-4481) and Kluyveromyces waltii (Chen et al. (1992) J. Gen.
Microbiol. 138:337-345). Other systems are known to those of skill in
the art (Stark et al. Trends Genet. 8:432-439; Utatsu et al. (1987) J.
Bacteriol. 169:5537-5545; see, also, U.S. Patent No. 6,171,861).

Members of the highly related family of site-specific recombinases,
the resolvase family, such as γδ, Tn3 resolvase, Hin, Gin, and Cin are also
available. Members of this family of recombinases are typically
constrained to intramolecular reactions (e.g., inversions and excisions)
and can require host-encoded factors. Mutants have been isolated that
relieve some of the requirements for host factors (Maeser et al. (1991)
Mol. Gen. Genet. 230:170-176), as well as some of the constraints
of intramolecular recombination (see, U.S. Patent No. 6,171,861).

The bacteriophage P1 Cre/lox and the yeast FLP/FRT systems are
particularly useful systems for site-specific integration, inversion or
excision of heterologous nucleic acid into, and out of, chromosomes,
particularly ACes as provided herein. In these systems a recombinase
(Cre or FLP) interacts specifically with its respective site-specific
recombination sequence (lox or FRT, respectively) to invert or excise the
intervening sequences. The site-specific recombination sequence for each
of these two systems is relatively short (34 bp for lox and 47 bp for FRT).

The FLP/FRT recombinase system has been demonstrated to
function efficiently in plant cells (U.S. Patent No. 5,744,386), and, thus,
can be used for producing plant artificial chromosome platforms. In
general, short incomplete FRT sites lead to higher accumulation of
excision products than the complete full-length FRT sites. The system
catalyzes intra- and intermolecular reactions, and, thus, can be used for
DNA excision and integration reactions. The recombination reaction is
reversible and this reversibility can compromise the efficiency of the
reaction in each direction. Altering the structure of the site-specific
recombination sequences is one approach to remedying this situation.
The site-specific recombination sequence can be mutated in a manner
that the product of the recombination reaction is no longer recognized as
a substrate for the reverse reaction, thereby stabilizing the integration or
excision event.

In the Cre-lox system, discovered in bacteriophage P1,
recombination between loxP sites occurs in the presence of the Cre
recombinase (see, e.g., U.S. Patent No. 5,658,772). This system can be
used to insert, invert or excise nucleic acid located between two lox sites.
Cre can be expressed from a vector. Since the lox site is an asymmetrical
nucleotide sequence, lox sites on the same DNA molecule can have the
same or opposite orientation with respect to each other. Recombination
between lox sites in the same orientation results in a deletion of the DNA
segment located between the two lox sites and a connection between the
resulting ends of the original DNA molecule. The deleted DNA segment
forms a circular molecule of DNA. The original DNA molecule and the
resulting circular molecule each contain a single lox site. Recombination
between lox sites in opposite orientations on the same DNA molecule
results in an inversion of the nucleotide sequence of the DNA segment
located between the two lox sites. In addition, reciprocal exchange of
DNA segments proximate to lox sites located on two different DNA
molecules can occur. All of these recombination events are catalyzed by
the product of the Cre coding region.

Any site-specific recombinase system known to those of skill in the
art is contemplated for use herein. It is contemplated that one or a
plurality of sites that direct the recombination by the recombinase are
introduced into an artificial chromosome to produce platform ACes. The
resulting platform ACes are introduced into cells with nucleic acid
encoding the cognate recombinase, typically on a vector, and nucleic acid
encoding heterologous nucleic acid of interest linked to the appropriate
recombination site for insertion into the platform ACes. The
recombinase-encoding nucleic acid may be introduced into the cells on the
same vector as, or a different vector from, that encoding the heterologous
nucleic acid.
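
The intramolecular outcomes described above for two lox sites on one DNA molecule (deletion when the sites share an orientation, inversion when they are opposed) can be modeled in a few lines. This is a toy model offered under stated assumptions: the molecule is a list of opaque segment labels, the tokens "lox>" and "lox<" are a hypothetical notation for the two orientations of a lox site, and intermolecular exchange is not modeled.

```python
LOX_TOKENS = ("lox>", "lox<")  # hypothetical notation for site orientation


def cre_recombine(molecule):
    """Predict the product of recombination between the two lox sites."""
    sites = [i for i, seg in enumerate(molecule) if seg in LOX_TOKENS]
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    i, j = sites
    between = molecule[i + 1:j]
    if molecule[i] == molecule[j]:
        # Same orientation: the intervening segment is excised as a circle.
        # The linear product and the circle each keep a single lox site.
        return {
            "event": "deletion",
            "linear": molecule[:i + 1] + molecule[j + 1:],
            "circle": between + [molecule[j]],
        }
    # Opposite orientation: the intervening segment is inverted in place.
    return {
        "event": "inversion",
        "linear": molecule[:i + 1] + between[::-1] + molecule[j:],
    }
```

For example, `cre_recombine(["a", "lox>", "b", "c", "lox>", "d"])` reports a deletion that leaves `["a", "lox>", "d"]` and a circle carrying `b`, `c` and one lox site.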

An E. coli phage lambda integrase system for ACes platform
engineering and for artificial chromosome engineering is provided (Lorbach
et al. (2000) J. Mol. Biol. 296:1175-1181). The phage lambda integrase
(Landy, A. (1989) Annu. Rev. Biochem. 58:913-949) is adapted herein and
the cognate att sites are provided. Chromosomes, including ACes,
engineered to contain one or a plurality of att sites are provided, as are
vectors encoding a mutant integrase that functions in the absence of other
factors. Methods using the modified chromosomes and vectors for
introduction of heterologous nucleic acid are also provided.

For purposes herein, one or more of the sites (e.g., a single site or
a pair of sites) required for recombination are introduced into an artificial
chromosome, such as an ACes chromosome. The enzyme for catalyzing
site-directed recombination is introduced with the DNA of interest, or
separately, or is engineered onto the artificial chromosome under the
control of a regulatable promoter.

As described herein, artificial chromosome platforms containing one
or multiple recombination sites are provided. The methods and resulting
products are exemplified with the lambda phage Att/Int system, but
similar methods may be used for production of ACes platforms with other
recombination systems.

The Att/Int system and vectors provided herein are not only
intended for engineering ACes platforms, but may be used to engineer an
Att/Int system into any chromosome. Introduction of att sites into a
chromosome will permit engineering of natural chromosomes, such as by
permitting targeted integration of genes or regulatory regions, and by
controlled excision of selected regions. For example, genes encoding a
particular trait may be added to a chromosome, such as a plant
chromosome engineered to contain one or a plurality of att sites. Such
chromosomes may be used for screening DNA to identify genes. Large
pieces of DNA can be introduced into cells and the cells screened
phenotypically to select those having the desired trait.

C. Platforms

Provided herein are platform artificial chromosomes (platform ACes)
containing single or multiple site-specific recombination sites.
Chromosome-based platform technology permits efficient and tractable
engineering and subsequent expression of multiple gene targets. Methods
are provided that use DNA vectors and fragments to create platform
artificial chromosomes, including animal, particularly mammalian, artificial
chromosomes, and plant artificial chromosomes. The artificial
chromosomes contain either single or multiple sequence-specific
recombination sites suitable for the placement of target gene expression
vectors onto the platform chromosome. The engineered
chromosome-based platform ACes technology is applicable for methods,
including cellular and transgenic protein production, transgenic plant and
animal production and gene therapy. The platform ACes are also useful for
producing a library of ACes comprising random portions of a given
genome (e.g., a mammalian, plant or prokaryotic genome) for genomic
screening, as well as a library of cells comprising different and/or mutually
exclusive ACes therein.

Exemplary of artificial chromosome platforms are those based on
ACes. ACes artificial chromosomes are non-viral, self-replicating nucleic
acid molecules that function as a natural chromosome, having all the
elements required for normal chromosomal replication and maintenance
within the cell nucleus. ACes artificial chromosomes do not rely on
integration into the genome of the cell to be effective, and they are not
limited by DNA carrying capacity and as such the therapeutic gene(s) of
interest, including regulatory sequences, can be engineered into the ACes.
In addition, ACes are stable in vitro and in vivo and can provide
predictable long-term gene expression. Once engineered and delivered to
the appropriate cell or embryo, ACes work independently alongside host
chromosomes and, for ACes that are predominantly heterochromatin,
produce only the products (proteins) from the genes they carry. As
provided herein ACes are modified by introduction of recombination site(s)
to provide a platform for ready introduction of heterologous nucleic acid.
The ACes platforms can be used for production of transgenic animals and
plants; as vectors for genetic therapy; for use as protein production
systems; for animal models to identify and target new therapeutics; in cell
culture for the development and production of therapeutic proteins; and
for a variety of other applications.

1. Generation of artificial chromosomes

Artificial chromosomes may be generated by any method known to
those of skill in the art. Of particular interest herein are the ACes artificial
chromosomes, which contain a repeated unit. Methods for production of
ACes are described in detail in U.S. Patent Nos. 6,025,155 and
6,077,697, which, as with all patents, applications, publications and
other disclosure, are incorporated herein in their entirety.

Generation of de novo ACes.

ACes can be generated by cotransfecting exogenous DNA, such
as a mammary tissue-specific DNA cassette including the gene sequences
for a therapeutic protein, with a rDNA fragment and a drug resistance
marker gene into the desired eukaryotic cell, such as plant or animal cells,
such as murine cells in vitro. DNA with a selectable or detectable marker
is introduced, and can be allowed to integrate randomly into pericentric
heterochromatin or can be targeted to pericentric heterochromatin, such
as that in rDNA gene arrays that reside on acrocentric chromosomes,
such as the short arms of acrocentric chromosomes. This integration
event activates the "megareplicator" sequence and amplifies the
pericentric heterochromatin and the exogenous DNA, and duplicates a
centromere. Ensuing breakage of this "dicentric" chromosome can result
in the production of daughter cells that contain the substantially-original
chromosome and the new artificial chromosome. The resulting ACes
contain all the essential elements needed for stability and replication in
dividing cells: centromere, origins of replication, and telomeres. ACes
have been produced that express marker genes (lacZ, green fluorescent
protein, neomycin-resistance, puromycin-resistance,
hygromycin-resistance) and genes of interest. Isolated ACes, for example,
have been successfully transferred intact to rodent, human, and bovine
cells by electroporation, sonoporation, microinjection, and transfection
with lipids and dendrimers.

To render the creation of ACes with desired genes more tractable
and efficient, "platform" ACes (platform-ACes) can be produced that
contain defined DNA sequences for enzyme-mediated homologous DNA
recombination, such as by Cre or FLP recombinases (Bouhassira et al.
(1996) Blood 88(supplement 1):190a; Bouhassira et al. (1997) Blood
90:3332-3344; Siebler et al. (1997) Biochemistry 36:1740-1747;
Siebler et al. (1998) Biochemistry 37:6229-6234; and Bethke et al.
(1997) Nucl. Acids Res. 25:2828-2834), and as exemplified herein the
lambda phage integrase. A lox site contains two 13 bp inverted repeats
to which Cre-recombinase binds and an intervening 8 bp core region.
Only pairs of sites having identity in the central 6 bp of the core region
are proficient for recombination; sites having non-identical core sequences
(heterospecific lox sites) do not efficiently recombine with each other
(Hoess et al. (1986) Nucleic Acids Res. 14:2287-2300).
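
The lox site structure just described (two 13 bp inverted repeats around an 8 bp core, with compatibility decided by the central 6 bp of the core) can be sketched as follows. The loxP sequence used is the widely published 34 bp sequence and is not recited in this passage. Taking the "central 6 bp" to be positions 2 through 7 of the 8 bp core, and comparing two sites written in the same orientation, are assumptions of the sketch, as are the function names.

```python
# Widely published 34 bp loxP sequence (an assumption; not given in the text).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def split_lox(site):
    """Split a 34 bp lox site into (left repeat, core, right repeat)."""
    if len(site) != 34:
        raise ValueError("a lox site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def can_recombine(site_a, site_b):
    """True if the central 6 bp of the two 8 bp core regions are identical."""
    core_a = split_lox(site_a)[1]
    core_b = split_lox(site_b)[1]
    return core_a[1:7] == core_b[1:7]
```

A hypothetical heterospecific site can be made by changing one base inside the central 6 bp, for example `LOXP[:15] + "G" + LOXP[16:]`; `can_recombine` then returns False against loxP while each site remains compatible with itself.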

Generating acrocentric chromosomes for plant
artificial chromosome formation.

In human and mouse cells de novo formation of a satellite DNA
based artificial chromosome (SATAC, also referred to as ACes) can occur
in an acrocentric chromosome where the short arm contains only
pericentric heterochromatin, the rDNA array, and telomere sequences.
Plant species may not have any acrocentric chromosomes with the same
physical structure described, but "megareplicator" DNA sequences reside
in the plant rDNA arrays, also known as the nucleolar organizing regions
(NOR). A structure like those seen in acrocentric mammalian
chromosomes can be generated using site-specific recombination between
appropriate arms of plant chromosomes.

Approach

Qin et al. ((1994) Proc. Natl. Acad. Sci. U.S.A. 91:1706-1710)
describes crossing two Nicotiana tabacum transgenic plants. One
plant contains a construct encoding a promoterless hygromycin-resistance
gene preceded by a lox site (lox-hpt); the other plant carries a construct
containing a cauliflower mosaic virus 35S promoter linked to a lox
sequence and the cre DNA recombinase coding region (35S-lox-cre). The
constructs were introduced separately by infecting leaf explants with
Agrobacterium tumefaciens, which carries the kanamycin-resistance gene
(KanR). The resultant KanR transgenic plants were crossed. Plants that
carried the appropriate DNA recombination event were identified by
hygromycin-resistance.

Modification of the above for generation of ACes

The KanR cultivars are initially screened, such as by FISH, to
identify two sets of candidate transgenic plants. One set has one
construct integrated in regions adjacent to the pericentric heterochromatin
on the short arm of any chromosome. The second set of candidate plants
has the other construct integrated in the NOR region of appropriate
chromosomes. To obtain reciprocal translocation both sites must be in
the same orientation. Therefore a series of crosses is required, KanR
plants generated, and FISH analyses performed to identify the appropriate
"acrocentric" plant chromosome for de novo plant ACes formation.

2. Bacteriophage lambda integrase-based site-specific
recombination system

An integral part of the platform technology includes a site-specific
recombination system that allows the placement of selected gene targets
or genomic fragments onto the platform chromosomes. Any such system
may be used. In particular, a method is provided for insertion of
additional DNA fragments into the platform chromosome residing in the
cell via sequence-specific recombination using the recombinase activity of
the bacteriophage lambda integrase. The lambda integrase system is
exemplary of the recombination systems contemplated for ACes. Any
known recombination system, including any described herein, particularly
any that operates without the need for additional factors or that, by virtue
of mutation, does not require additional factors, is contemplated.

As noted, the lambda integrase system provided herein can be used
with natural chromosomes and artificial chromosomes in addition to
ACes. Single or a plurality of recombination sites, which may be the
same or different, are introduced into artificial chromosomes to produce
artificial chromosome platforms.

3. Creation of bacteriophage lambda integrase site-specific
recombination system

The lambda phage-encoded integrase (designated Int) is a
prototypical member of the integrase family. Int effects integration and
excision of the phage in and out of the E. coli genome via recombination
between pairs of attachment sites designated attB/attP and attL/attR.
Each att site contains two inverted 9 base pair core Int binding sites and a
7 base pair overlap region that is identical in wild-type att sites. Each
site, except for attB, contains additional Int binding sites. In flanking
regions, there are recognition sequences for accessory DNA binding
proteins, such as integration host factor (IHF), factor for inversion
stimulation (FIS) and the phage encoded excision protein (XIS). Except
for attB, Int is a heterobivalent DNA-binding protein and, with assistance
from the accessory proteins and negative DNA supercoiling, binds
simultaneously to core and arm sites within the same att site.

Int, like Cre and FLP, executes an ordered sequential pair of strand
exchanges during integrative and excisive recombination. The natural
pairs of target sequences for Int, attB and attP or attL and attR, are
located on the same or different DNA molecules resulting in intra- or
intermolecular recombination, respectively. For example, intramolecular
recombination occurs between inversely oriented attB and attP, or
between attL and attR sequences, respectively, leading to inversion of the
intervening DNA segment.

Like the recombinase systems, such as Cre and FLP, Int directs
site-specific recombination. Unlike the other systems, such as Cre and FLP,
Int generally requires additional protein factors for integrative and excisive
recombination and negative supercoiling for integrative recombination.
Hence, the Int system had not been used in eukaryotic targeting systems.
Mutant Int proteins, designated Int-h (E174K) and a derivative
thereof Int-h/218 (E174K/E218K) do not require accessory proteins to
perform intramolecular integrative and excisive recombination in
co-transfection assays in human cells (Lorbach et al. (2000) J. Mol. Biol.
296:1175-1181); wild-type Int does not catalyze intramolecular
recombination in human cells harboring target sites attB and attP.
Hence it had been demonstrated that mutant Int can catalyze
factor-independent recombination events in human cells.

There has been no demonstration by others that this system can be
used for engineering of eukaryotic genomes or chromosomes. Provided
herein are chromosomes, including artificial chromosomes, such as but
not limited to ACes that contain att sites (e.g., platform ACes), and the
use of such chromosomes for targeted integration of heterologous DNA
into such chromosomes in eukaryotic cells, including animal, such as
rodent and human, and plant cells. Mutant Int provided herein is shown
to effect site-directed recombination between sites in artificial
chromosomes and vectors containing cognate sites.

An additional component of the chromosome-based platform
technology is the site-specific integration of target DNA sequences onto
the platform. For this, the native bacteriophage lambda integrase has
been modified to carry out this sequence-specific DNA recombination
event in eukaryotic cells. The bacteriophage lambda integrase and its
cognate DNA substrate att are members of the site-specific recombinase
family that also includes the bacteriophage P1 Cre/lox system as well as
the Saccharomyces cerevisiae 2 micron based FLP/FRT system (see, e.g.,
Landy (1989) Ann. Rev. Biochem. 58:913-949; Hoess et al. (1982) Proc.
Natl. Acad. Sci. U.S.A. 79:3398-3402; Broach et al. (1982) Cell 29:227-
234).

By combining DNA endonuclease and DNA ligase activity, these
recombinases recognize and catalyze DNA exchanges between sequences
flanking the recognition site. During the integration of the lambda genome
into the E. coli genome (lambda recombination), the phage integrase (INT)
in association with accessory proteins catalyzes the DNA exchange
between the attP site of the phage genome and the attB site of the
bacterial genome, resulting in the formation of attL and attR sites (Figure
6). The engineered bacteriophage lambda integrase has been produced
herein to carry out an intermolecular DNA recombination event between
an incoming DNA molecule (primarily on a vector containing the bacterial
attB site) and the chromosome-based platform carrying the lambda attP
sequence independent of lambda bacteriophage or bacterial accessory
proteins.

In contrast to the bi-directional Cre/lox and FLP/FRT systems, the
engineered lambda recombination system derived for chromosome-based
platform technology is advantageously unidirectional because accessory
proteins, which are absent, are required for excision of integrated nucleic
acid upon further exposure to the lambda Int recombinase.
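
The integration reaction and its unidirectionality can be pictured with a
small model. The following Python sketch is illustrative only and is not
part of the disclosed methods: DNA molecules are represented as lists of
element names rather than sequences, the function and element names
(integrate, excise, marker2, backbone) are hypothetical, and the placement
of the hybrid sites assumes attP (POP') and attB (BOB') in the same
orientation, so that the junction upstream of the insert is POB' (attR)
and the junction downstream is BOP' (attL).

```python
def integrate(platform, vector):
    """Model Int-mediated attP x attB integration of a circular vector.

    platform: list of element names (linear) containing one "attP".
    vector:   list of element names (circular) containing one "attB".
    """
    p = platform.index("attP")
    b = vector.index("attB")
    # Open the circular vector at attB: elements after attB come first.
    insert = vector[b + 1:] + vector[:b]
    # Assumed orientation: hybrid site attR upstream, attL downstream.
    return platform[:p] + ["attR"] + insert + ["attL"] + platform[p + 1:]


def excise(construct, accessory_proteins=False):
    """Model excision (attL x attR), which needs accessory proteins."""
    if not accessory_proteins:
        # Accessory proteins absent: the integrated insert is retained.
        return list(construct)
    r = construct.index("attR")
    l = construct.index("attL")
    return construct[:r] + ["attP"] + construct[l + 1:]


platform = ["rDNA", "promoter", "attP", "rDNA"]
vector = ["attB", "marker2", "target_gene", "backbone"]
loaded = integrate(platform, vector)
```

In this toy model, calling excise(loaded) without accessory proteins returns
the loaded platform unchanged, which mirrors the unidirectional behavior
described above; the attP site is restored only when accessory_proteins is
set to True.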

4. Creation of platform chromosome containing single or 
multiple sequence-specific recombination sites 

a. Multiple sites

For the creation of a platform chromosome containing multiple,
sequence-specific recombination sites, artificial chromosomes are
produced as depicted in Figure 5 and Example 3. As discussed above,
artificial chromosomes can be produced using any suitable methodology,
including those described in U.S. Patent Nos. 5,288,625; 5,712,134;
5,891,691; 6,025,155. Briefly, to prepare artificial chromosomes
containing multiple recombination (e.g., integration) sites, nucleic acid
(in the form of one or more plasmids, such as the plasmid
pSV40193attPsensePUR set forth in Example 3) is targeted into an
amplifiable region of a chromosome, such as the pericentric region of a
chromosome. Among such regions are the rDNA gene loci in acrocentric
mammalian chromosomes. Hence, targeting nucleic acid for integration
into the rDNA region of mammalian acrocentric chromosomes can include
the mouse rDNA fragments (for targeting into rodent cell lines) or large
human rDNA regions on BAC/PAC vectors (or subclones thereof in
standard vectors) for targeting into human acrocentric chromosomes,
such as for human gene therapy applications. The targeting nucleic acid
generally includes a detectable or selectable marker, such as an antibiotic
resistance marker (e.g., for puromycin or hygromycin), a recombination site
(such as attP, attB, attL, attR or the like), and/or human selectable
markers as required for gene therapy applications. Cells are grown under
conditions that result in amplification and ultimately production of ACes
artificial chromosomes having multiple recombination (e.g., integration)
sites therein. ACes having the desired size are selected for further
engineering.

b. Creation of platform chromosome containing a 
single sequence-specific recombination site 

In this method a mammalian platform artificial chromosome is
generated containing a single sequence-specific recombination site. In
the Example below, this approach is demonstrated using a puromycin
resistance marker for selection and a mouse rDNA fragment for targeting
into the rDNA locus on mouse acrocentric chromosomes. Other selection
markers and targeting DNA sequences as desired and known to those of
skill in the art can be used. Additional resistance markers include genes
conferring resistance to the antibiotics neomycin, blasticidin, hygromycin
and zeocin. For applications, such as gene therapy, in which potentially
immunogenic responses are to be avoided, host-derived (such as
human-derived) selectable markers or markers detectable with monoclonal
antibodies (MAb) followed by fluorescence-activated cell sorting (FACS)
can be used. Examples in this class include, but are not limited to: human
nerve growth factor receptor (detection with MAb); truncated human
growth factor receptor (detection with MAb); mutant human dihydrofolate
reductase (DHFR; detectable using a fluorescent methotrexate substrate);
secreted alkaline phosphatase (SEAP; detectable with fluorescent substrate);
thymidylate synthase (TS; confers resistance to fluorodeoxyuridine);
human CAD gene (confers resistance to N-phosphonacetyl-L-aspartate
(PALA)).

To construct a platform artificial chromosome with a single site, an
ACes artificial chromosome (or other artificial chromosome of interest)
can be produced containing a selectable marker. A single
sequence-specific recombination site is targeted onto ACes via homologous
recombination. For this, DNA sequences containing the site-specific
recombination sequence are flanked with DNA sequences homologous to
a selected sequence in the chromosome. For example, when using a
chromosome containing rDNA or satellite DNA, such DNA can be used as
homologous sequences to target the site-specific recombination sequence
onto the chromosome. A vector is designed to have these homologous
sequences flanking the site-specific recombination site and, after the
appropriate restriction enzyme digest to generate free ends of homology
to the chromosome, the DNA is transfected into cells harboring the
chromosome. After transfection and integration of the site-specific
cassette, cells carrying homologous recombination events on the platform
chromosome are subcloned and identified, for example by screening
single cell subclones via expression of resistance or a fluorescent marker
and PCR analysis. In one embodiment, a platform artificial chromosome,
such as a platform ACes, that contains a single copy of the recombination
site is selected. Examples 2B and 2D exemplify the process, and Figure 3
provides a diagram depicting one method for the creation of a platform
mammalian chromosome containing a single sequence-specific
recombination site.

5. Lambda integrase mediated recombination of target gene
expression vector onto platform chromosome

The third component of the chromosome-based platform
technology involves the use of target gene expression vectors carrying,
for example, genes for gene therapy, genes for transgenic animal or plant
production, and those required for cellular protein production of interest.
Using lambda integrase mediated site-specific recombination, or any other
recombinase-mediated site-specific recombination, the target gene
expression vectors are introduced onto the selected chromosome
platform. The use of target gene expression vectors permits use of the de
novo generated chromosome-based platforms for a wide range of gene
targets. Furthermore, chromosome platforms containing multiple attP
sites provide the opportunity to incorporate multiple gene targets onto a
single platform, thereby providing for expression of multiple gene targets,
including the expression of cellular and genetic regulatory genes and the
expression of all or parts of metabolic pathways. In addition to
expressing small target genes, such as cDNA and hybrid cDNA/artificial
intron constructs, the chromosome-based platform can be used for
engineering and expressing large genomic fragments carrying target genes
along with their endogenous genomic promoter sequences. This is of
importance, for example, where the therapy requires precise cell specific
expression and in instances where expression is best achieved from
genomic clones rather than cDNA clones. Figure 9 provides a diagram
summarizing one embodiment of the chromosome-based technology.

A feature of interest to include in the target gene expression vector
is a promoterless marker gene, which as exemplified (see Figure
9) contains an upstream attB site (marker 2 on Figure 9). The nucleic
acid encoding the marker is not expressed unless it is placed downstream
from a promoter sequence. Using the recombinase technology provided
herein, such as the lambda integrase technology (λINT E174R on Figure 8),
site-specific recombination between the attB site on the
vector and the promoter-attP site (in the "sense" orientation) on the
chromosome-based platform results in the expression of marker 2 on the
target gene expression vector, thereby providing a positive selection for
the lambda INT mediated site-specific recombination event. Site-specific
recombination events on the chromosome-based platform versus random
integrations next to a promoter in the genome (false positive) can be
quickly screened by designing primers to detect the correct event by PCR.
Examples of suitable marker 2 genes include, but are not limited to,
genes that confer resistance to toxic compounds or antibiotics,
fluorescence activated cell sorting (FACS) sortable cell surface markers
and various fluorescent markers. Examples of these genes include, but
are not limited to, human L26a R (human homolog of Saccharomyces
cerevisiae CYH 8 gene), neomycin, puromycin, blasticidin, CD24 (see, e.g.,
US Patents 5,804,177 and 6,074,836), truncated CD4, truncated low
affinity nerve growth factor receptor (LNGFR), truncated LDL receptor,
truncated human growth hormone receptor, GFP, RFP, BFP.
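
The logic of the positive selection and of the PCR screen for false
positives can be sketched in a few lines of Python. This is a minimal
illustration under stated assumptions, not the screening procedure of the
Examples: constructs are lists of element names, the names
platform_promoter, genomic_promoter and marker2 are hypothetical, the hybrid
site left at the promoter junction is labeled attR by assumption, and a
"PCR positive" result simply means that the forward-primer element lies
directly upstream of the reverse-primer element with only att sites between
them.

```python
ATT_SITES = {"attB", "attP", "attL", "attR"}


def marker_is_driven(construct, marker="marker2"):
    """True if the nearest upstream non-att element is a promoter."""
    i = construct.index(marker) - 1
    while i >= 0 and construct[i] in ATT_SITES:
        i -= 1
    return i >= 0 and construct[i].endswith("promoter")


def junction_pcr_positive(construct, fwd="platform_promoter", rev="marker2"):
    """True if fwd lies upstream of rev with only att sites in between."""
    if fwd not in construct or rev not in construct:
        return False
    f, r = construct.index(fwd), construct.index(rev)
    return f < r and all(e in ATT_SITES for e in construct[f + 1:r])


# Site-specific event on the platform versus a random genomic insertion.
site_specific = ["rDNA", "platform_promoter", "attR", "marker2",
                 "target_gene", "attL", "rDNA"]
random_insertion = ["genomic_promoter", "marker2", "target_gene"]
```

Both constructs express marker 2 in this model, so both would survive
selection, but only the site-specific event gives a product with a primer
pair spanning the platform promoter and marker 2.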

The target gene expression vectors contain a gene (target gene) for
expression from the chromosome platform. The target gene can be
expressed using various constitutive or regulated promoter systems
across various mammalian species. For the expression of multiple target
genes within the same target gene expression vector, the expression of
the multiple targets can be coordinately regulated via viral-based or
human internal ribosome entry site (IRES) elements (see, e.g., Jackson et
al. (1990) Trends Biochem Sci. 15:477-83; Oumard et al. (2000) Mol.
Cell. Biol. 20:2755-2759). Furthermore, using IRES type elements linked
to a downstream fluorescent marker, e.g., green, red or blue fluorescent
proteins (GFP, RFP, BFP), allows for the identification of high expressing
clones from the integrated target gene expression vector.

In certain embodiments described herein, the promoterless marker
can be transcriptionally downstream of the heterologous nucleic acid,
wherein the heterologous nucleic acid encodes a heterologous protein,
and wherein the expression level of the selectable marker is
transcriptionally linked to the expression level of the heterologous protein.
In addition, the selectable marker and the heterologous nucleic acid can
be transcriptionally linked by the presence of an IRES between them. As
set forth herein, the selectable marker is selected from the group
consisting of an antibiotic resistance gene and a detectable protein,
wherein the detectable protein is chromogenic or fluorescent.

Expression from the target gene expression vector integrated onto the
chromosome-based platform can be further enhanced using genomic
insulator/boundary elements. The incorporation of insulator sequences
into the target gene expression vector helps define boundaries in
chromatin structure and thus minimizes influence of chromatin position
effects/gene silencing on the expression of the target gene (Bell et al.
(1999) Current Opinion in Genetics and Development 9:191-198; Emery
et al. (2000) Proc. Natl. Acad. Sci. U.S.A. 97:9150-9155). Examples of
insulator elements that can be included on the target gene expression
vector in order to optimize expression include, but are not limited to:
1) chicken β-globin HS4 element (Prioleau et al. (1999) EMBO J.
18:4035-4048);
2) matrix attachment regions (MAR; see, e.g., Ramakrishnan et
al. (2000) Mol. Cell. Biol. 20:868-877);
3) scaffold attachment regions (SAR; see, e.g., Auten et al.
(1999) Human Gene Therapy 10:1389-1399); and
4) universal chromatin opening elements (UCOE; WO 00/05393
and WO 02/24930).

The copy number of the target gene can be controlled by
sequentially adding multiple target gene expression vectors containing the
target gene onto multiple integration sites on the chromosome platform.
Likewise, the copy number of the target gene can be controlled within an
individual target gene expression vector by the addition of DNA
sequences that promote gene amplification. For example, gene
amplification can be induced utilizing the dihydrofolate reductase (DHFR)
minigene with subsequent selection with methotrexate (see, e.g.,
Schimke (1984) Cell 37:705-713) or amplification promoting sequences
from the rDNA locus (see, e.g., Wegner et al. (1989) Nucl. Acids Res. 17:
9909-9932).

6. Platforms with other recombinase system sites

A "double lox" targeting strategy mediated by Cre-recombinase
(Bethke et al. (1997) Nucl. Acids Res. 25:2828-2834) can be used. This
strategy employs a pair of heterospecific lox sites, loxA and loxB, which
differ by one nucleotide in the 8 bp spacer region. Both sites are
engineered into the artificial chromosome and also onto the targeting DNA
vector. This allows for a direct site-specific insertion of a commercially
relevant gene or genes by a Cre-catalyzed double crossover event. In
essence, a platform ACes is engineered with a hygromycin-resistance gene
flanked by the double lox sites, generating lox-ACes, which is maintained
in the thymidine kinase deficient cell line, LMtk(-). The gene of interest, for
example, for testing purposes, the green fluorescence protein gene, GFP,
and a HSV thymidine kinase gene (tk) marker, are engineered between the
appropriate lox sites of the targeting vector. The vector DNA is
cotransfected with plasmid pBS185 (Life Technologies) encoding the Cre
recombinase gene into mammalian cells maintaining the dual-lox artificial
chromosome. Transient expression of the Cre recombinase catalyzes the
site-specific insertion of the gene and the tk-gene onto the artificial
chromosome. The transfected cells are grown in HAT medium that
selects for only those cells that have integrated and expressed the
thymidine kinase gene. The HAT-resistant colonies are screened by PCR
analyses to identify artificial chromosomes with the desired insertion.

To generate the lox-ACes, Lambda-HygR-lox DNA is transfected
into the LMtk(-) cell line harboring the precursor ACes.
Hygromycin-resistant colonies are analyzed by FISH and Southern blotting
for the presence of a single copy insert on the ACes.

To demonstrate the gene replacement technology, cell lines
containing candidate lox-ACes are cotransfected with pTK-GFP-lox and
pBS185 (encoding the Cre recombinase gene) DNA. After transfection,
transient expression of plasmid pBS185 will provide a sufficient burst of
Cre recombinase activity to catalyze DNA recombination at the lox sites.
Thus, a double crossover event between the ACes target and the
exogenous targeting plasmid carrying the loxA and loxB sites permits the
simple replacement of the hygromycin-resistance gene on the lox-ACes
with the tk-GFP cassette from the targeting plasmid, with no integration of
vector DNA. Transfected cells are grown in HAT medium to select for
tk expression. Correct targeting will result in the generation of
HAT-resistant, hygromycin-sensitive, and green fluorescent cells. The
desired integration event is verified by Southern and PCR analyses.
Specific PCR primer sets are used to amplify DNA sequences flanking the
individual loxA and loxB sites on the lox-ACes before and after
recombination.
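
The double crossover can be pictured as a swap of whatever lies between the
two heterospecific lox sites. The short Python sketch below is a conceptual
model only: molecules are lists of element names, the names hygR, tk, GFP
and backbone are placeholders chosen for illustration, and the function
cre_cassette_exchange is hypothetical rather than a description of the
experimental protocol.

```python
def cre_cassette_exchange(platform, donor):
    """Replace the platform segment between loxA and loxB with the donor's."""

    def cassette_bounds(molecule):
        a, b = molecule.index("loxA"), molecule.index("loxB")
        if a > b:
            raise ValueError("expected loxA upstream of loxB")
        return a, b

    pa, pb = cassette_bounds(platform)
    da, db = cassette_bounds(donor)
    # Keep platform flanks and both lox sites; take only the donor cassette,
    # so no vector backbone is carried onto the chromosome.
    return platform[:pa + 1] + donor[da + 1:db] + platform[pb:]


lox_aces = ["ACes", "loxA", "hygR", "loxB", "ACes"]
targeting_plasmid = ["backbone", "loxA", "tk", "GFP", "loxB", "backbone"]
exchanged = cre_cassette_exchange(lox_aces, targeting_plasmid)
```

In the model, the exchanged chromosome carries tk and GFP but neither hygR
nor plasmid backbone, which corresponds to the HAT-resistant,
hygromycin-sensitive, green fluorescent phenotype expected for correct
targeting.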
D. Exemplary applications of the Platform ACes 

Platform ACes are applicable and tractable for different/optimized
cell lines. Those that include a fluorescent marker, for example, can be
purified and isolated using fluorescence-activated cell sorting (FACS), and
subsequently delivered to a target cell. Those with selectable markers
provide for efficient selection and provide a growth advantage. Platform
ACes allow multiple payload delivery of donor target vectors via a
positive-selection, site-specific recombination system, and they allow for
the inclusion of additional genetic factors that improve protein production
and protein quality.

The construction and use of the platform ACes as provided for
each application may be similarly applied to other applications. Particular
descriptions are for exemplification.

1. Cellular Protein Production Platform ACes (CPP ACes)
As described herein, ACes can be produced from acrocentric
chromosomes in rodent (mouse, hamster) cell lines via megareplicator
induced amplification of heterochromatin/rDNA sequences. Such ACes
are ideal for cellular protein production as well as other applications
described herein and known to those of skill in the art. ACes platforms
that contain a plurality of recombination sites are particularly suitable for
engineering as cellular protein production systems.

In one embodiment, CPP ACes involve a two-component system:
the platform chromosome containing multiple engineering sites and the
donor target vector containing a platform-specific recombination site with
designed expression cassettes (see Figure 9).

The platform ACes can be produced from any artificial
chromosome, particularly the amplification-based artificial chromosomes.
For exemplification, they are produced from rodent artificial chromosomes
produced from acrocentric chromosomes using the technology of U.S.
Patent Nos. 6,077,697 and 6,025,155 and published International PCT
application No. WO 97/40183, in which nucleic acid is targeted to the
pericentric heterochromatin and, particularly, into rDNA to initiate the
replication event(s). The ACes can be produced directly in the chosen
cellular protein production cell lines, such as, but not limited to, CHO
cells, hybridomas, plant cells, plant tissues, plant protoplasts, stem cells
and plant calli.

a. Platform Construction

In the exemplary embodiment, the initial de novo platform
construction requires co-transfecting with excess targeting DNA, such as
rDNA or lambda DNA without an attP region, and an engineered
selectable marker. The engineered selectable marker should contain a
promoter, generally a constitutive promoter, such as a human or viral
(e.g., adenovirus or SV40) promoter, including the human ferritin heavy
chain promoter (SEQ ID NO:128), SV40 and EF1α promoters, to control
expression of a marker gene that provides a selective growth advantage
to the cell. An example of such a marker gene is the E. coli hisD gene
(encoding histidinol dehydrogenase), which is homologous and analogous
to the S. typhimurium hisD, a dominant marker selection system for
mammalian cells previously described (see, Hartman et al. (1988) Proc.
Natl. Acad. Sci. U.S.A. 85:8047-8051). Since histidine is an essential
amino acid in mammals and a nutritional requirement in cell culture, the E.
coli hisD gene can be used to select for histidine prototrophy in defined
media. Furthermore, more stringent selection can be placed on the cells
by including histidinol in the medium. Histidinol is itself permeable and
toxic to cells. The hisD gene provides a means of detoxification.

Placed between the promoter and the marker gene is the
bacteriophage lambda attP site, to use the bacteriophage lambda integrase
dependent site-specific recombination system (described herein). The
insertion of an attP site downstream of a promoter element provides
forward selection of site-specific recombination events onto the platform
ACes.

b. Donor Target Vector Construction 

A second component of the CPP platform ACes system involves
the construction of donor target vectors encoding a gene product(s) of
interest for the CPP platform ACes. Individual donor target vectors can
be designed for each gene product to be expressed, thus enabling
maximum usage of a de novo constructed platform ACes, so that one or
a few CPP platform ACes will be required for many gene targets.

A key feature of the donor target vector is the promoterless marker
gene containing an upstream attB site (marker 2 on Figure 9). Normally
the marker would not be expressed unless it is placed downstream of a
promoter sequence. As discussed above, using the lambda integrase
technology (λINT E174R on Figure 8 and Figure 9), site-specific
recombination between the attB site on the vector and the promoter-attP
site on the CPP platform ACes results in the expression of the donor target
vector marker, providing positive selection for the site-specific event.
Site-specific recombination events on the CPP ACes versus random
integrations next to a promoter in the genome (false positive) can be
quickly screened by designing primers to detect the correct event by PCR.

In addition, since the lambda integrase reaction is unidirectional, i.e.,
the excision reaction is not possible, a number of unique targets can be
loaded onto the CPP platform ACes, limited only by the number of markers
available.

Additional features of the donor target vector include gene target
expression cassettes flanked by either chromatin insulator regions, matrix
attachment regions (MAR) or scaffold attachment regions (SAR). The use
of these regions will provide a more "open" chromatin environment for
gene expression and help alleviate silencing. An example of such a
cassette for expressing a monoclonal antibody is described. For this
purpose, a strong constitutive promoter, e.g., chicken β-actin or RNA PolI,
is used to drive the expression of the heavy and light chain open reading
frames. The heavy and light chain sequences flank a nonattenuated
human IRES (IRESH; from the 5'UTR of NRF1 gene; see Oumard et al.,
2000, Mol. and Cell Biol., 20(8):2755-2759) element, thereby
coordinating transcription of both heavy and light chain sequences. Distal
to the light chain open reading frame resides an additional viral encoded
IRES (IRESV; modified EMCV internal ribosomal entry site (IRES)) element
attenuating the expression of the fluorescent marker gene hrGFP from
Renilla (Stratagene). By linking the hrGFP with an attenuated IRES, the
heavy and light chains along with the hrGFP are expressed from a single
transcript. Thus, the identification of hrGFP fluorescing cells will provide
a means to detect protein producing cells. In addition, high producing cell
lines can be identified and isolated by FACS, thereby decreasing the time
needed to find high expressers. Functional monoclonal antibody will be
confirmed by ELISA.

c. Additional components in cellular protein production
platform ACes (CPP ACes)
In addition to the aforementioned CPP ACes system, other genetic
factors can be included to enhance the yield and quality of the expressed
protein. Again to provide maximum flexibility, these additional factors
can be inserted onto the CPP platform ACes by λINT E174R dependent
site-specific recombination. Other factors that could be used with a CPP
Platform ACes include, for example, the adenovirus E1a transactivation
system, which upregulates both cellular and viral promoters (see, e.g.,
Svensson and Akusjarvi (1984) EMBO J. 3:789-794; and US patents
5,866,359; 4,775,630 and 4,920,211).

d. Targets for CHO-ACes engineering to enhance cell
growth, such as CHO cell growth and protein
production/quality

If adding these additional factors onto the CPP ACes is not prudent
or desired, the host cell, such as a CHO cell, can be engineered to express
these factors (see, below, targets for CHO-ACes engineering to enhance CHO
cell growth and protein production/quality). Additional factors to consider
include: addition of insulin or IGF-1 to sustain viability;
human sialyltransferases or related factors to produce more human-like
glycoproteins; expression of factors to decrease ammonium accumulation
during cell growth; expression of factors to inhibit apoptosis; expression
of factors to improve protein secretion and protein folding; and expression
of factors to permit serum-free transfection and selection.

1) Addition of insulin or IGF-1 to sustain
viability

Stimulatory factors and/or their receptors are expressed to set up
an autocrine loop, to improve cell growth, such as CHO cell growth. Two
exemplary candidates are insulin and IGF-1 (see, Biotechnol Prog 2000
Sep;16(5):693-7). Insulin is the most commonly used growth factor for
sustaining cell growth and viability in serum-free Chinese hamster ovary
(CHO) cell cultures. Insulin and IGF-1 analog (LongR(3)) serve as growth
and viability factors for CHO cells.

CHO cells were modified to produce higher levels of essential
nutrients and factors. A serum-free (SF) medium for dihydrofolate
reductase-deficient Chinese hamster ovary cells (DG44 cells) was
prepared. Chinese hamster ovary cells (DG44 cells), which are normally
maintained in 10% serum medium, were gradually weaned to 0.5%
serum medium to increase the probability of successful growth in SF
medium (see, Kim et al. (1999) In Vitro Cell Dev Biol Anim 35(4):178-82).
A SF medium (SF-DG44) was formulated by supplementing the basal
medium with these components; basal medium was prepared by
supplementing Dulbecco's modified Eagle's medium and Ham's nutrient
mixture F12 with hypoxanthine (10 mg/l) and thymidine (10 mg/l).
Development of a SF medium for DG44 cells was facilitated using a
Plackett-Burman design technique and weaning of cells.
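
For readers unfamiliar with the Plackett-Burman technique, the following
Python sketch shows how a standard 12-run screening design for up to 11
two-level factors can be generated and how main effects can be estimated
from it. It is a generic illustration of the design technique, not a
reconstruction of the cited study: the medium components actually screened
are not listed here, so the factors are simply column indices, and the
function names plackett_burman_12 and main_effects are hypothetical.

```python
def plackett_burman_12():
    """Return the 12-run, 11-factor Plackett-Burman design as rows of +1/-1."""
    generator = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]
    n = len(generator)
    # Eleven cyclic shifts of the generator row, then one row of all -1.
    rows = [generator[n - s:] + generator[:n - s] for s in range(n)]
    rows.append([-1] * n)
    return rows


def main_effects(design, responses):
    """Estimate each factor's main effect: mean(high) minus mean(low)."""
    half = len(design) / 2
    return [
        sum(row[j] * y for row, y in zip(design, responses)) / half
        for j in range(len(design[0]))
    ]


design = plackett_burman_12()
```

Each row is one culture condition, with +1 meaning a candidate component is
present (or at its high level) and -1 meaning it is absent (or at its low
level). Because the columns are balanced and mutually orthogonal, a response
such as viable cell density measured in the 12 runs gives an independent
main-effect estimate for each component.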


2) Human sialyltransferases or related 
factors to produce more human-like 
glycoproteins 

CHO cells have been modified by increasing their ability to process
protein via addition of complex carbohydrates. This has been achieved by
overexpression of relevant processing enzymes, or in some cases,
reducing expression of relevant enzymes (see, Bragonzi et al. (2000)
Biochim Biophys Acta 1474(3):273-282; see, also Weikert et al. (1999)
Nature Biotech. 17:1116-1121; Ferrari J et al. (1998) Biotechnol Bioeng
60(5):589-95). A CHO cell line expressing alpha2,6-sialyltransferase was
developed for the production of human-like sialylated recombinant
glycoproteins. The sialylation defect of CHO cells can be corrected by
transfecting the alpha2,6-sialyltransferase (alpha2,6-ST) cDNA into the
cells. Glycoproteins produced by such CHO cells display alpha2,6- and
alpha2,3-linked terminal sialic acid residues, similar to human
glycoproteins.

As another example for improving the production of human-like
sialylated recombinant glycoproteins, a CHO cell line has been developed
that constitutively expresses sialidase antisense RNA (see, Ferrari J et al.
(1998) Biotechnol Bioeng 60(5):589-95). Several antisense expression
vectors were prepared using different regions of the sialidase gene.
Co-transfection of the antisense constructs with a vector conferring
puromycin resistance gave rise to over 40 puromycin resistant clones that
were screened for sialidase activity. A 5' 474 bp coding segment of the
sialidase cDNA, in the inverted orientation in an SV40-based expression
vector, gave maximal reduction of the sialidase activity to about 40% of
wild-type values.

Oligosaccharide biosynthesis pathways in mammalian cells have
been engineered for generation of recombinant glycoproteins (see, e.g.,
Sburlati (1998) Biotechnol Prog 14(2):189-92, which describes a Chinese
hamster ovary (CHO) cell line capable of producing bisected
oligosaccharides on glycoproteins). This cell line was created by
overexpression of a recombinant N-acetylglucosaminyltransferase III
(GnT-III) (see, also, Prati et al. (1998) Biotechnol Bioeng 59:445-50, which
describes antisense strategies for glycosylation engineering of CHO cells).

3) Expression of factors to decrease 
ammonium accumulation during cell 
growth 

Excess ammonium, which is a by-product of CHO cell metabolism,
can have detrimental effects on cell growth and protein quality (see, Yang
et al. (2000) Biotechnol Bioeng 68(4):370-80). To solve this problem,
ammonium levels were modified by overexpressing carbamoyl phosphate
synthetase I and ornithine transcarbamoylase or glutamine synthetase in
CHO cells. Such modification resulted in reduced ammonium levels
and an increase in the growth rate (see Kim et al. (2000) J
Biotechnol 81(2-3):129-40; and Enosawa et al. (1997) Cell Transplant
6(5):537-40).

4) Expression of factors to improve protein 
secretion and protein folding 





Overexpression of relevant enzymes can be engineered into the
ACes to improve protein secretion and folding.

5) Expression of factors to permit serum-free 
transfection and selection 

It is advantageous to have the ability to convert CHO cells in
suspension growing in serum-free medium to adherence without having
to resort to serum addition. Laminin or fibronectin addition is sufficient to
make cells adherent (see, e.g., Zaworski et al. (1993) Biotechniques
15(5):363-6), so that expressing either of these genes in CHO cells under
an inducible promoter should allow for a reversible shift to adherence
without requiring serum addition.

2. Platform ACes and Gene Therapy 

The platform ACes provided herein are contemplated for use in 
mammalian gene therapy, particularly human gene therapy. Human ACes 

1 5 can be derived from human acrocentric chromosomes from human host 

cells, in which the amplified sequences are heterochromatic and/or human 
rDNA. Different platform ACes applicable for different tissue cell types 
are provided. The ACes for gene therapy can contain a single copy of a 
therapeutic gene inserted into a defined location on platform ACes.
Therapeutic genes include genomic clones, cDNA, hybrid genes and other
combinations of sequences. Preferred selectable markers are those from 
the mammalian host, such as human derived factors so that they are non- 
immunogenic, non-toxic and allow for efficient selection, such as by 
FACS and/or drug resistance. 

Platform ACes, useful for gene therapy and other applications, as
noted herein, can be generated by megareplicator dependent
amplification, such as by the methods in U.S. Patent Nos. 6,077,697 and 
6,025,155 and published International PCT application No. 
WO 97/40183. In one embodiment, human ACes are produced using
human rDNA constructs that target rDNA arrays on human acrocentric
chromosomes and induce the megareplicator in human cells, particularly
in primary cell lines (with sufficient number of doublings to form the
ACes) or stem cells (such as hematopoietic stem cells, mesenchymal
stem cells, adult stem cells or embryonic stem cells) to avoid the
introduction of potentially harmful rearranged DNA sequences present in
many transformed cell lines. Megareplicator induced ACes formation can 
result in multiple copies of targeting DNA/selectable markers in each 
amplification block on both chromosomal arms of the platform ACes. 

In view of the considerations regarding immunogenicity and
toxicity, the production of human platform ACes for gene therapy
applications employs a two component system analogous to the platform 
ACes designed for cellular protein production (CPP platform ACes). The 
system includes a platform chromosome of entirely human DNA origin
containing multiple engineering sites and a gene target vector carrying the
therapeutic gene of interest. 

a. Platform Construction 
The initial de novo construction of the platform chromosome 
employs the co-transfection of excess targeting DNA and a selectable
marker. In one embodiment, the DNA is targeted to the rDNA arrays on
the human acrocentric chromosomes (chromosomes 13, 14, 15, 21 and
22). For example, two large human rDNA containing PAC clones 18714
and 18720 and the human PAC clone 558F8 are used for targeting
(Genome Research (ML) now Incyte, BACPAC Resources, 747 52nd
Street, Oakland CA). The mouse rDNA clone pFK161 (SEQ ID NO: 118),
which was used to make the human SATAC from the 94-3
hamster/human hybrid cell line (see, e.g., published International PCT
application No. WO 97/40183 and Csonka et al., Journal of Cell Science
113:3207-3216; and Example 1 for a description of pFK161), can also be
used. 

For animal applications, selectable markers should be non- 
immunogenic in the animal, such as a human, and include, but are not 
limited to: human nerve growth factor receptor (detected with a MAb,
such as described in US patent 6,365,373); truncated human growth
factor receptor (detected with MAb); mutant human dihydrofolate
reductase (DHFR; fluorescent MTX substrate available); secreted alkaline
phosphatase (SEAP; fluorescent substrate available); human thymidylate
synthase (TS; confers resistance to anti-cancer agent fluorodeoxyuridine);
human glutathione S-transferase alpha (GSTA1; conjugates glutathione to 
the stem cell selective alkylator busulfan; chemoprotective selectable 
marker in CD34+ cells); CD24 cell surface antigen in hematopoietic stem 
cells; human CAD gene to confer resistance to N-phosphonacetyl-L-
aspartate (PALA); human multi-drug resistance-1 (MDR-1; P-glycoprotein
surface protein selectable by increased drug resistance or enriched by
FACS); human CD25 (IL-2Rα; detectable by MAb-FITC); methylguanine-
DNA methyltransferase (MGMT; selectable by carmustine); and cytidine
deaminase (CD; selectable by Ara-C).

Since megareplicator induced amplification generates multiple
copies of the selectable marker, a second consideration for the selection
of the human marker is the resulting dose of the expressed marker after
ACes formation. High level expression of certain markers may be
detrimental to the cell and/or result in autoimmunity. One method to
decrease the dose of the marker protein is to shorten its half-life, such
as by fusion to the well-conserved human ubiquitin tag (a 76 amino
acid sequence), thus leading to increased turnover of the selectable
marker. This has been used successfully for a number of reporter
systems including DHFR (see, e.g., Stack et al. (2000) Nature
Biotechnology 18:1298-1302 and references cited therein).
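
The dose consideration above can be illustrated with a minimal sketch,
assuming simple first-order turnover of the marker protein. This model,
the function name and all numerical values are assumptions introduced
here for illustration only; they are not taken from this application or
from the cited reference. Under that assumption the steady-state marker
level scales with copy number and with half-life, so a shorter half-life
can offset part of the increase caused by amplification.

```python
import math


def steady_state_level(copies, synthesis_per_copy, half_life_h):
    """Steady-state protein level under an assumed first-order model.

    dP/dt = copies * synthesis_per_copy - k_deg * P, k_deg = ln(2) / half_life.
    Setting dP/dt = 0 gives P_ss = copies * synthesis_per_copy / k_deg.
    """
    k_deg = math.log(2) / half_life_h
    return copies * synthesis_per_copy / k_deg


# Hypothetical values, chosen only to show the scaling.
single_copy = steady_state_level(copies=1, synthesis_per_copy=100.0, half_life_h=20.0)
amplified = steady_state_level(copies=50, synthesis_per_copy=100.0, half_life_h=20.0)
# Same amplified copy number, but a tagged marker with a tenfold shorter half-life.
amplified_tagged = steady_state_level(copies=50, synthesis_per_copy=100.0, half_life_h=2.0)

print(amplified / single_copy)        # fold increase due to copy number alone
print(amplified_tagged / amplified)   # fold change due to the shorter half-life
```

In this sketch the half-life enters linearly, so a tenfold reduction in
half-life lowers the steady-state level tenfold at a fixed copy number.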

Using the ubiquitin tagged protein, a human selectable marker 
system analogous to the CPP ACes described herein is constructed. 
Briefly, a tagged selectable marker, such as for example one of those 
described herein, is cloned downstream of an attP site and expressed 
from a human promoter. Exemplary promoters contemplated for use 
herein include, but are not limited to, the human ferritin heavy chain 
promoter (SEQ ID NO:128); RNA Pol II; EF1α; TR; glyceraldehyde-3-
phosphate dehydrogenase core promoter (GAP); a GAP core promoter
including a proximal insulin inducible element and the intervening GAP 
sequence; phosphofructokinase promoter; and phosphoglycerate kinase 
promoter. Also contemplated herein is an aldolase A promoter H1 & H2 
(representing closely spaced transcriptional start sites) along with the 
proximal H enhancer. There are 4 promoters (e.g., transcriptional start 
sites) for this gene, each having different regulatory and tissue activity. 
The H (most proximal 2) promoters are ubiquitously expressed off the H 
enhancer. The resulting marker construct can then be co-transfected along
with excess human rDNA targeting sequence into the host cells. An
important criterion for the selection of the recipient cells is a sufficient
number of cell doublings for the formation and detection of ACes.
Accordingly, the co-transfections should be
attempted in human primary cells that can be cultured for long periods of 
time, such as for example, stem cells (e.g., hematopoietic, mesenchymal, 
adult or embryonic stem cells), or the like. Additional cell types, include, 
but are not limited to: single gene transfected cells exhibiting increased 
life-span; over-expressing c-myc cells, e.g. MSU1.1 (Morgan et al., 1991,
Exp. Cell Res., Nov;197(1):125-136); over-expressing telomerase lines,
such as TERT cells; SV40 large T-antigen transfected lines; tumor cell
lines, such as HT1080; and hybrid human cell lines, such as the 94-3
hamster/human hybrid cell line.

b. Gene Target Vector 
The second component of the GT platform ACes (GT ACes) system
involves the use of engineered target vectors carrying the therapeutic
gene of interest. These are introduced onto the GT platform ACes via 
site-specific recombination. As with the CPP ACes, the use of engineered 
target vectors maximizes the use of the de novo generated GT platform
ACes for most gene targets. Furthermore, using lambda integrase
technology, GT platform ACes containing multiple attP sites provide the
opportunity to incorporate multiple therapeutic targets onto a single
platform. This could be of value in cases where a defined therapy
requires multiple gene targets, a single therapeutic target requires an
additional gene regulatory factor or a GT ACes requires a "kill" switch.

Similar to the CPP ACes, a feature of the gene target vector is the 
promoterless marker gene containing an upstream attB site (marker 2 on
Figure 9). Normally, the marker (in this case, a cell surface antigen that
can be sorted by FACS would be ideal) would not be expressed unless it
is placed downstream of a promoter sequence. Using the lambda
integrase technology (λINT E174R on Figure 9), site-specific recombination
between the attB site on the vector and the promoter-attP site on the GT
platform ACes results in the expression of marker 2 on the gene target
vector, i.e., positive selection for the site-specific event. Site-specific
recombination events on the GT ACes versus random integrations next to
a promoter in the genome (false positives) can be quickly screened by
designing primers to detect the correct event by PCR.
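
The PCR screen described above can be sketched in silico as follows.
This is a minimal illustration under stated assumptions: every sequence
below is a short hypothetical placeholder (not an actual promoter, att
site or marker sequence), and the helper names are introduced here only
for the example. One primer is placed in the platform promoter and the
other in the marker, so a product is expected only when the promoter and
marker have been joined by the site-specific event.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def amplicon_length(template, fwd_primer, rev_primer):
    """Length of the product if both primers anneal in a productive
    orientation on the given top strand, otherwise None."""
    start = template.find(fwd_primer)
    rev_site = template.find(reverse_complement(rev_primer))
    if start == -1 or rev_site == -1 or rev_site < start:
        return None
    return rev_site + len(rev_primer) - start


# Hypothetical placeholder sequences (assumptions for illustration only).
PLATFORM_PROMOTER = "GATTACAGGCTTAACCGGTA"   # promoter upstream of attP on the ACes
RECOMBINED_SITE = "ACGTACGTAC"               # junction left after attP x attB
MARKER = "ATGGCCTCAGGATCCTTGCA"              # promoterless marker 2 on the vector
GENOMIC_PROMOTER = "CCGATTGCAATCGGTTAACG"    # unrelated promoter in the genome

correct_event = PLATFORM_PROMOTER + RECOMBINED_SITE + MARKER
random_integration = GENOMIC_PROMOTER + RECOMBINED_SITE + MARKER

fwd = PLATFORM_PROMOTER[:10]                 # anneals in the platform promoter
rev = reverse_complement(MARKER[-10:])       # anneals at the 3' end of the marker

print(amplicon_length(correct_event, fwd, rev))       # product spans the junction
print(amplicon_length(random_integration, fwd, rev))  # no product expected
```

Because the forward primer is specific to the platform promoter, a
marker that became expressed through random integration next to an
unrelated promoter gives no product in this sketch.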

For expression of the therapeutic gene, human specific promoters, 
such as a ferritin heavy chain promoter (SEQ ID NO: 128), EF1α or RNA
Pol II, are used. These promoters are for high level expression of a cDNA
encoded therapeutic protein. In addition to expressing cDNA (or even
hybrid cDNA/artificial intron constructs), the GT platform ACes are used
for engineering and expressing large genomic fragments carrying
therapeutic genes of interest expressed from native promoter sequences.
This is of importance in situations where the therapy requires precise cell 
specific expression or in instances where expression is best achieved 
from genomic clones versus cDNA. 

3. Selectable markers for use, for example, in Gene 
Therapy (GT)

The following are selectable markers that can be incorporated into 
human ACes and used for selection. 

Dual Resistance to 4-Hydroperoxycyclophosphamide 
and Methotrexate by Retroviral Transfer of the Human 
Aldehyde Dehydrogenase Class 1 Gene and a Mutated
Dihydrofolate Reductase Gene

The genetic transfer of drug resistance to hematopoietic cells is one 
approach to overcoming myelosuppression caused by high-dose 
chemotherapy. Because cyclophosphamide (CTX) and methotrexate
(MTX) are commonly used non-cross-resistant drugs, generation of dual
drug resistance in hematopoietic cells that allows dose intensification may
increase anti-tumor effects and circumvent the emergence of drug-
resistant tumors. A retroviral vector containing a human cytosolic
ALDH-1-encoding DNA clone and a human doubly mutated DHFR-encoding
clone (Phe22/Ser31; termed F/S in the description of constructs) was
constructed to generate increased resistance to CTX and MTX (Takebe
et al. (2001) Mol Ther 3(1):88-96). This construct may be useful for
protecting patients from high-dose CTX- and MTX-induced
myelosuppression. ACes can be similarly constructed.

Multiple mechanisms of N-phosphonacetyl-L-aspartate
resistance in human cell lines: carbamyl-P
synthetase/aspartate transcarbamylase/dihydro-orotase
gene amplification is frequent only when chromosome
2 is rearranged

Rodent cells resistant to N-phosphonacetyl-L-aspartate (PALA)
invariably contain amplified carbamyl-P synthetase/aspartate
transcarbamylase/dihydro-orotase (CAD) genes, usually in widely spaced
tandem arrays present as extensions of the same chromosome arm that
carries a single copy of CAD in normal cells (Smith et al. (1997) Proc.
Natl. Acad. Sci. U.S.A. 94:1816-21). In contrast, amplification of CAD is
very infrequent in several human tumor cell lines. Cell lines with minimal 
chromosomal rearrangement and with unrearranged copies of 
chromosome 2 rarely develop intrachromosomal amplifications of CAD.
These cells frequently become resistant to PALA through a mechanism
that increases the aspartate transcarbamylase activity with no increase in
CAD copy number, or they obtain one extra copy of CAD by forming an 
isochromosome 2p or by retaining an extra copy of chromosome 2. In 
cells with multiple chromosomal aberrations and rearranged copies of
chromosome 2, amplification of CAD as tandem arrays from rearranged
chromosomes is the most frequent mechanism of PALA resistance. All of 
these different mechanisms of PALA resistance are blocked in normal 
human fibroblasts. Thus, ACes with multiple copies of the CAD gene 
would provide PALA resistance. 

Retroviral coexpression of thymidylate synthase and
dihydrofolate reductase confers fluoropyrimidine and
antifolate resistance 

Retroviral gene transfer of dominant selectable markers into
hematopoietic cells can be used to select genetically modified cells in vivo
or to attenuate the toxic effects of chemotherapeutic agents. Fantz et al.
((1998) Biochem Biophys Res Comm 243(1):6-12) have shown that
retroviral gene transfer of thymidylate synthase (TS) confers resistance to
TS directed anticancer agents and that co-expression of TS and
dihydrofolate reductase (DHFR) confers resistance to TS and DHFR
cytotoxic agents. Retroviral vectors encoding Escherichia coli TS, human
TS, and the Tyr-to-His at residue 33 variant of human TS (Y33HhTS)
were constructed, and fibroblasts transfected with these vectors showed
comparable resistance to the TS-directed agent fluorodeoxyuridine
(FdUrd, approximately 4-fold). Retroviral vectors that encode dual 
expression of Y33HhTS and the human L22Y DHFR (L22YhDHFR)
variants conferred resistance to FdUrd (3- to 5-fold) and trimetrexate (30-
to 140-fold). A L22YhDHFR-Y33HhTS chimeric retroviral vector was also
constructed and transduced cells were resistant to FdUrd (3-fold), AG337 
(3-fold), trimetrexate (100-fold) and methotrexate (5-fold). These results 
show that recombinant retroviruses can be used to transfer the cDNAs
that encode TS and DHFR and that dual expression in transduced cells is
sufficiently high to confer resistance to TS and DHFR directed anticancer
agents. ACes can be similarly constructed.

Human CD34+ cells do not express glutathione S-
transferases alpha

The expression of glutathione S-transferases alpha (GST alpha) in
human hematopoietic CD34+ cells and bone marrow was studied using
RT-PCR and immunoblotting (Czerwinski M, Kiem et al. (1997) Gene Ther
4(3):268-70). The GSTA1 protein conjugates glutathione to the stem cell
selective alkylator busulfan. This reaction is the major pathway of
elimination of the compound from the human body. Human hematopoietic
CD34+ cells and bone marrow do not express GSTA1 message, which
was present at a high level in liver, an organ relatively resistant to
busulfan toxicity in comparison to bone marrow. Similarly, baboon
CD34+ cells and dog bone marrow do not express GSTA1. Thus, human
GSTA1 is a chemoprotective selectable marker in human stem cell gene
therapy and could be employed in ACes construction.

Selection of retrovirally transduced hematopoietic cells 
using CD24 as a marker of gene transfer 

Pawliuk et al. ((1994) Blood 84(9):2868-2877) have investigated
the use of a cell surface antigen as a dominant selectable marker to
facilitate the detection and selection of retrovirally infected target cells.
The small coding region of the human cell surface antigen CD24
(approximately 240 bp) was introduced into a myeloproliferative sarcoma
virus (MPSV)-based retroviral vector, which was then used to infect day 4
5-fluorouracil (5-FU)-treated murine bone marrow cells. Within 48 hours 
of termination of the infection procedure CD24-expressing cells were 
selected by fluorescence-activated cell sorting (FACS) with an antibody
directed against the CD24 antigen. Functional analysis of these cells
showed that they included not only in vitro clonogenic progenitors and
day 12 colony-forming unit-spleen but also cells capable of competitive 
long-term hematopoietic repopulation. Double-antibody labeling studies 
performed on recipients of retrovirally transduced marrow cells showed 
that some granulocytes, macrophages, erythrocytes, and, to a lesser
extent, B and T lymphocytes still expressed the transduced CD24 gene at
high levels 4 months later. No gross abnormalities in hematopoiesis were 
detected in mice repopulated with CD24-expressing cells. These results 
show that the use of the CD24 cell surface antigen as a retrovirally 
encoded marker permits rapid, efficient, and nontoxic selection in vitro of
infected primary cells, facilitates tracking and phenotyping of their
progeny, and provides a tool to identify elements that regulate the 
expression of transduced genes in the most primitive hematopoietic cells. 
ACes could be similarly constructed.

DeltahGHR, a biosafe cell surface-labeling molecule for
analysis and selection of genetically transduced human 
cells 

A selectable marker for retroviral transduction and selection of
human and murine cells is known (see, Garcia-Ortiz et al. (2000) Hum
Gene Ther 11(2):333-46). The molecule expressed on the cell surface of
the transduced population is a truncated version of human growth 
hormone receptor (deltahGHR), capable of ligand (hGH) binding, but 
devoid of the domains involved in signal triggering. The engineered
molecule is stably expressed in the target cells as an inert protein unable
to trigger proliferation or to rescue the cells from apoptosis after ligand
binding. This new marker has a wide application spectrum, since hGHR
in the human adult is highly expressed only in liver cells, and lower levels 
have been reported in certain lymphocyte cell populations. The
deltahGHR label has high biosafety potential, as it belongs to a well-
characterized hormonal system that is nonessential in adults, and there is
extensive clinical experience with hGH administration in humans. The 
differential binding properties of several monoclonal antibodies (MAbs) are 
used in a cell rescue method in which the antibody used to select
deltahGHR-transduced cells is eluted by competition with hGH or,
alternatively, biotinylated hGH is used to capture tagged cells. In the latter
system, the final purified population is recovered free of attached
antibodies in hGH (a substance approved for human use)-containing
medium. Such a system could be used to identify ACes-containing cells.

4. Transgenic models for evaluation of genes and
discovery of new traits in plants

Of interest is the use of plants and plant cells containing artificial
chromosomes for the evaluation of new genetic combinations and
discovery of new traits. Artificial chromosomes, by virtue of the fact that
they can contain significant amounts of DNA, can encode
numerous genes and accordingly a multiplicity of traits. It is
contemplated here that artificial chromosomes, when formed from one 
plant species, can be evaluated in a second plant species. The resultant 
phenotypic changes observed, for example, can indicate the nature of the
genes contained within the DNA of the artificial
chromosome, and hence permit the identification of novel genetic
activities. Artificial chromosomes containing euchromatic DNA or partially 
containing euchromatic DNA can serve as a valuable source of new traits 
when transferred to an alien plant cell environment. For example, it is
contemplated that artificial chromosomes derived from dicot plant species
can be introduced into monocot plant species by transferring a dicot
artificial chromosome possessing a region of euchromatic DNA
containing expressed genes.

The artificial chromosomes can be designed to allow the artificial
chromosome to recombine with the naturally occurring plant DNA in such
a fashion that a large region of naturally occurring plant DNA becomes 
incorporated into the artificial chromosome. This allows the artificial 
chromosome to contain new genetic activities and hence carry novel 
traits. For example, an artificial chromosome can be introduced into a
wild relative of a crop plant under conditions whereby a portion of the
DNA present in the chromosomes of the wild relative is transferred to the
artificial chromosome. After isolation of the artificial chromosome, this 
naturally occurring region of DNA from the wild relative, now located on
the artificial chromosome, can be introduced into the domesticated crop
species and the genes encoded within the transferred DNA expressed and
evaluated for utility. New traits and gene systems can be discovered in 
this fashion. The artificial chromosome can be modified to contain 
sequences that promote homologous recombination within plant cells, or
be modified to contain a genetic system that functions as a site-specific
recombination system. 

Artificial chromosomes modified to recombine with plant DNA offer 
many advantages for the discovery and evaluation of traits in different 
plant species. When the artificial chromosome containing DNA from one
plant species is introduced into a new plant species, new traits and genes 
can be introduced. This use of an artificial chromosome allows for the 
ability to overcome the sexual barrier that prevents transfer of genes from 
one plant species to another species. Using artificial chromosomes in this
fashion allows for many potentially valuable traits to be identified,
including traits that are typically found in wild species. Other valuable
applications for artificial chromosomes include the ability to transfer large 
regions of DNA from one plant species to another, such as DNA encoding 
potentially valuable traits such as altered oil, carbohydrate or protein
composition, multiple genes encoding enzymes capable of producing
valuable plant secondary metabolites, genetic systems encoding valuable
agronomic traits such as disease and insect resistance, genes encoding
functions that allow association with soil bacteria such as growth
promoting bacteria or nitrogen fixing bacteria, or genes encoding traits
that confer freezing, drought or other stress tolerances. In this fashion,
artificial chromosomes can be used to discover regions of plant DNA that 
encode valuable traits. 

The artificial chromosome can also be designed to allow the 
transfer and subsequent incorporation of these valuable traits now located
on the artificial chromosome into the natural chromosomes of a plant
species. In this fashion the artificial chromosomes can be used to 
transfer large regions of DNA encoding traits normally found in one plant 
species into another plant species. In this fashion, it is possible to derive 
a plant cell that no longer needs to carry an artificial chromosome to
possess the novel trait. Thus, the artificial chromosome would serve as
the transfer mechanism to permit the formation of plants with a greater
degree of genetic diversity.

The design of an artificial chromosome to accomplish the afore- 
mentioned purposes can include within the artificial chromosome the 
presence of specific DNA sequences capable of acting as sites for 
homologous recombination to take place. For example, the DNA 
sequence of Arabidopsis is now known. To construct an artificial 
chromosome capable of recombining with a specific region of Arabidopsis 
DNA, a sequence of Arabidopsis DNA, normally located near a 
chromosomal location encoding genes of potential interest can be 
introduced into an artificial chromosome by methods provided herein. It 
may be desirable to include a second region of DNA within the artificial 
chromosome that provides a second flanking sequence to the region 
encoding genes of potential interest, to promote a double recombination 
event which would ensure transfer of the entire chromosomal region, 
encoding genes of potential interest, to the artificial chromosome. The 
modified artificial chromosome, containing the DNA sequences capable of
homologous recombination, can then be introduced into
Arabidopsis cells and the homologous recombination event selected.

It is convenient to include a marker gene to allow for the selection 
of a homologous recombination event. The marker gene is preferably 
inactive unless activated by an appropriate homologous recombination 
event. For example, US 5,272,071 describes a method where an
inactive plant gene is activated by a recombination event such that 
desired homologous recombination events can be easily scored. Similarly, 
US 5,501,967 describes a method for the selection of homologous 
recombination events by activation of a silent selection gene first 
introduced into the plant DNA, the gene being activated by an appropriate
homologous recombination event. Both of these methods can be applied
to provide a selection for recombination
between an artificial chromosome and a plant chromosome. Once the
homologous recombination event is detected and selected, the artificial
chromosome is isolated and introduced into a recipient cell, for example,
tobacco, corn, wheat or rice, and the expression of the newly introduced
DNA sequences evaluated.

Phenotypic changes in the recipient plant cells containing the 
artificial chromosome, or in regenerated plants containing the artificial
chromosome, allow for the evaluation of the nature of the traits encoded
by the Arabidopsis DNA, under conditions naturally found in plant cells, 
including the naturally occurring arrangement of DNA sequences 
responsible for the developmental control of the traits in the normal 
chromosomal environment. 

Traits such as durable fungal or bacterial disease resistance, new
oil and carbohydrate compositions, valuable secondary metabolites such
as phytosterols, flavonoids, efficient nitrogen fixation or mineral 
utilization, resistance to extremes of drought, heat or cold are all found 
within different populations of plant species and are often governed by
multiple genes. The use of single gene transformation technologies does
not permit the evaluation of the multiplicity of genes controlling many 
valuable traits. Thus, incorporation of these genes into artificial 
chromosomes allows the rapid evaluation of the utility of these genetic 
combinations in heterologous plant species. 

The large scale order and structure of the artificial chromosome
provides a number of unique advantages in screening for new utilities or
novel phenotypes within heterologous plant species. The size of new 
DNA that can be carried by an artificial chromosome can be millions of 
base pairs of DNA, representing potentially numerous genes that may
have novel utility in a heterologous plant cell. Because the artificial
chromosome is a "natural" environment for gene expression, the problems
of variable gene expression and silencing seen for genes transferred by
random insertion into a genome should not be observed. Similarly, there
is no need to engineer the genes for expression, and the genes inserted
would not need to be recombinant genes. Thus, one expects the expression
from the transferred genes to follow the temporal and spatial pattern
observed in the species from which the genes were initially isolated. A
valuable feature for these utilities is the ability to isolate the
artificial chromosomes and to
further isolate, manipulate and introduce into other cells artificial
chromosomes carrying unique genetic compositions. 

Thus, artificial chromosomes and homologous
recombination in plant cells can be used to isolate and identify many
valuable crop traits. 

In addition to the use of artificial chromosomes for the isolation and
testing of large regions of naturally occurring DNA, methods for the use
of artificial chromosomes and cloned DNA are also contemplated. Similar 
to that described above, artificial chromosomes can be used to carry large 
regions of cloned DNA, including that derived from other plant species. 

The ability to incorporate novel DNA elements into an artificial
chromosome as it is being formed allows for the development of artificial
chromosomes specifically engineered as a platform for testing of new 
genetic combinations, or "genomic" discoveries for model species such as 
Arabidopsis. It is known that specific "recombinase" systems can be
used in plant cells to excise or re-arrange genes. These same systems
can be used to derive new gene combinations contained on an artificial 
chromosome. 

The artificial chromosomes can be engineered as platforms to 
accept large regions of cloned DNA, such as that contained in Bacterial
Artificial Chromosomes (BACs) or Yeast Artificial Chromosomes (YACs).
It is further contemplated that, as a result of the typical structure of
artificial chromosomes containing tandemly repeated DNA blocks,
sequences other than cloned DNA sequence can be introduced by
recombination processes. In particular, recombination within a predefined
region of the tandemly repeated DNA within the artificial chromosome
provides a mechanism to "stack" numerous regions of cloned DNA,
including large regions of DNA contained within BAC or YAC clones.
Thus, multiple combinations of genes can be introduced onto artificial
chromosomes and these combinations tested for functionality. In
particular, it is contemplated that multiple YACs or BACs can be stacked
onto an artificial chromosome, the BACs or YACs containing multiple
genes of complex pathways or multiple genetic pathways. The BACs or 
YACs are typically selected based on genetic information available within
the public domain, for example from the Arabidopsis Information
Management System (http://aims.cps.msu.edu/aims/index.html) or the
information related to the plant DNA sequences available from the 
Institute for Genomic Research (http://www.tigr.org) and other sites 
known to those skilled in the art. Alternatively, clones can be chosen at
random and evaluated for functionality. It is contemplated that
combinations providing a desired phenotype can be identified by isolation
of the artificial chromosome containing the combination and analyzing the 
nature of the inserted cloned DNA. 

In this regard, it is contemplated that the use of site-specific
recombination sequences can have considerable utility in developing
artificial chromosomes containing DNA sequences recognized by 
recombinase enzymes and capable of accepting DNA sequences 
containing same. The use of site-specific recombination as a means to 
target an introduced DNA to a specific locus has been demonstrated in
the art and such methods can be employed. The recombinase systems
can also be used to transfer the cloned DNA regions contained within the 
artificial chromosome to the naturally occurring plant or mammalian 
chromosomes. 

As noted herein, many site-specific recombinases are known and
can be identified (Kilby et al. (1993) Trends in Genetics 9:413-418). The
three recombinase systems that have been extensively employed include:
an activity identified as R encoded by the pSR1 plasmid of
Zygosaccharomyces rouxii, FLP encoded by the 2 µm circular plasmid from
Saccharomyces cerevisiae and Cre-lox from the phage P1.

The integration function of site-specific recombinases is 
contemplated as a means to assist in the derivation of genetic 
combinations on artificial chromosomes. In order to accomplish this, it is 
contemplated that a first step of introducing site-specific recombinase
sites into the genome of a plant cell in an essentially random manner is
conducted, such that the plant cell has one or more site-specific 
recombinase recognition sequences on one or more of the plant 
chromosomes. An artificial chromosome is then introduced into the plant 
cell, the artificial chromosome engineered to contain a recombinase 

20 recognition site (e.g., integration site) capable of being recognized by a 
site-specific recombinase. Optionally, a gene encoding a recombinase 
enzyme is also included, preferably under the control of an inducible 
promoter. Expression of the site-specific recombinase enzyme in the 
plant cell, either by induction of a inducible recombinase gene, or 

25 transient expression of a recombinase sequence, causes a site-specific 
recombination event to take place, leading to the Insertion of a region of 
the plant chromosomal DNA (containing the recombinase recognition site) 
into the recombinase recognition site of the artificial chromosome, and 
forming an artificial chromosome containing plant chromosomal DNA. 
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The artificial chromosome can be isolated and introduced into a 
heterologous host, preferably a plant host, and expression of the newly 
introduced plant chromosomal DNA can be monitored and evaluated for 
desirable phenotypic changes. Accordingly, carrying out this 
5 recombination with a population of plant cells wherein the chromosomally 
located recombinase recognition site is randomly scattered throughout the 
chromosomes of the plant, can lead to the formation of a population of 
artificial chromosomes, each with a different region of plant chromosomal 
DNA, and each potentially representing a novel genetic combination. 
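As a purely illustrative sketch, and not part of the disclosed method, the library-forming idea described above can be modelled in Python. The model assumes that each cell carries a single recognition site at a uniformly random chromosomal position and that recombination captures a fixed-size region beginning at that site; the genome length, region size and function names are all hypothetical.

```python
import random


def build_library(genome_length, n_cells, region_size, seed=0):
    """Return one captured (start, end) region per cell (half-open interval).

    Assumption: each cell has one recognition site at a uniformly random
    position, and the artificial chromosome captures `region_size` bases
    starting at that site (clipped at the end of the genome).
    """
    rng = random.Random(seed)
    library = []
    for _ in range(n_cells):
        site = rng.randrange(genome_length)
        library.append((site, min(site + region_size, genome_length)))
    return library


def coverage(library, genome_length):
    """Fraction of genome positions captured by at least one library member."""
    covered = 0
    last_end = 0
    for start, end in sorted(library):
        start = max(start, last_end)  # skip the part already counted
        if end > start:
            covered += end - start
            last_end = end
    return covered / genome_length
```

Under these assumptions, increasing the number of cells raises the fraction of the genome represented in the population of artificial chromosomes, which is the property the library approach relies on.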

This method requires the precise site-specific insertion of
chromosomal DNA into the artificial chromosome. This precision has
been demonstrated in the art. For example, Fukushige and Sauer ((1992)
Proc. Natl. Acad. Sci. USA, 89:7905-7909) demonstrated that the Cre-lox
homologous recombination system could be successfully employed to
introduce DNA into a predefined locus in a chromosome of mammalian
cells. In this demonstration a promoter-less antibiotic resistance gene
modified to include a lox sequence at the 5' end of the coding region was
introduced into CHO cells. Cells were re-transformed by electroporation
with a plasmid that contained a promoter with a lox sequence and a
transiently expressed Cre recombinase gene. Under the conditions
employed, the expression of the Cre enzyme catalyzed the homologous
recombination between the lox site in the chromosomally located
promoter-less antibiotic resistance gene, and the lox site in the introduced
promoter sequence, leading to the formation of a functional antibiotic
resistance gene. The authors demonstrated efficient and correct targeting
of the introduced sequence: 54 of 56 lines analyzed corresponded to the
predicted single copy insertion of the DNA due to Cre-catalyzed
site-specific homologous recombination between the lox sequences.

Accordingly, a lox sequence may be first added to a genome of a
plant species capable of being transformed and regenerated to a whole
plant to serve as a recombinase target DNA sequence for recombination
with an artificial chromosome. The lox sequence may be optimally
modified to further contain a selectable marker which is inactive but can
be activated by insertion of the lox recombinase recognition sequence into
the artificial chromosome.
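For orientation, the targeting frequency reported above for Fukushige and Sauer (54 of 56 lines) can be restated as a percentage with a rough uncertainty range. The Wilson score interval used here is an illustrative statistical choice made for this sketch, not an analysis performed in the cited work.

```python
import math

correct, total = 54, 56  # lines with the predicted single-copy insertion


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


fraction = correct / total
low, high = wilson_interval(correct, total)
print(f"correctly targeted: {fraction:.1%}")
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

The point estimate is about 96.4%; the interval simply indicates how much a sample of 56 lines constrains that figure.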

A promoterless marker gene or selectable marker gene linked to the
recombinase recognition sequence, which is first inserted into the
chromosomes of a plant cell, can be used to engineer a platform
chromosome. A promoter is linked to a recombinase recognition site, in
an orientation that allows the promoter to control the expression of the
marker or selectable marker gene upon recombination within the artificial
chromosome. Upon a site-specific recombination event between a
recombinase recognition site in a plant chromosome and the recombinase
recognition site within the introduced artificial chromosome, a cell is
derived with a recombined artificial chromosome, the artificial
chromosome containing an active marker or selectable marker activity
that permits the identification and/or selection of the cell.
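The marker-activation logic described above can be illustrated with a toy model in which constructs are ordered lists of named elements. This is a simplified sketch under stated assumptions: the element names, the list representation and both functions are hypothetical, and orientation is reduced to left-to-right order.

```python
def recombine(left, right, site="lox"):
    """Join `left` up to and including its site with `right` after its site."""
    i = left.index(site)
    j = right.index(site)
    return left[: i + 1] + right[j + 1:]


def marker_active(construct, site="lox"):
    """The marker is expressed only when promoter, site and marker are contiguous."""
    return any(
        construct[k:k + 3] == ["promoter", site, "marker"]
        for k in range(len(construct) - 2)
    )


# Hypothetical layouts: the promoter sits on the artificial chromosome,
# the promoterless marker on the plant chromosome.
artificial = ["centromere", "promoter", "lox"]
plant = ["plant_DNA_5prime", "lox", "marker", "plant_DNA_3prime"]

recombined = recombine(artificial, plant)
print(recombined, marker_active(recombined))
```

Neither starting construct expresses the marker on its own; only the recombined product places the promoter directly upstream of the marker, which is what allows recombinants to be identified or selected.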

The artificial chromosomes can be transferred to other plant or
animal species and the functionality of the new combinations tested. The
ability to conduct such an inter-chromosomal transfer of sequences has
been demonstrated in the art. For example, the use of the Cre-lox
recombinase system to cause a chromosome recombination event
between two chromatids of different chromosomes has been shown.

Any number of recombination systems may be employed as
described herein, such as, but not limited to, bacterially derived systems
such as the att/int system of phage lambda, and the Gin/gix system.

More than one recombination system may be employed, including,
for example, one recombinase system for the introduction of DNA into an
artificial chromosome, and a second recombinase system for the
subsequent transfer of the newly introduced DNA contained within an
artificial chromosome into the naturally occurring chromosome of a
second plant species. The choice of the specific recombination system
used will be dependent on the nature of the modification contemplated.

By having the ability to isolate an artificial chromosome, in
particular, artificial chromosomes containing plant chromosomal DNA
introduced via site-specific recombination, and re-introduce the
chromosome into other mammalian or plant cells, particularly plant cells,
these new combinations can be evaluated in different crop species
without the need to first isolate and modify the genes, or carry out
multiple transformations or gene transfers to achieve the same
combination, isolation and testing of combinations of the genes in plants.
The use of a site-specific recombinase also allows the convenient
recovery of the plant chromosomal region into other recombinant DNA
vectors and systems, such as mammalian or insect systems, for
manipulation and study.

Also contemplated herein are ACes, cell lines and methods for use
in screening new chromosomal combinations, deletions and truncations
with eukaryotic genomes that take advantage of the site-specific
recombination systems incorporated onto platform ACes provided herein.
For example, provided herein is a cell line useful for making a library of
ACes, comprising a multiplicity of heterologous recombination sites
randomly integrated throughout the endogenous chromosomes. Also
provided herein is a method of making a library of ACes comprising
random portions of a genome, comprising introducing one or more ACes
into a cell line comprising a multiplicity of heterologous recombination
sites randomly integrated throughout the endogenous chromosomes,
under conditions that promote the site-specific chromosomal arm
exchange of the ACes into, and out of, a multiplicity of the heterologous
recombination sites within the cell's chromosomal DNA; and isolating said
multiplicity of ACes, thereby producing a library of ACes whereby multiple
ACes have different portions of the genome within. Also provided herein
is a library of cells useful for genomic screening, said library comprising a
multiplicity of cells, wherein each cell comprises an ACes having a
mutually exclusive portion of a chromosomal nucleic acid therein. The
library of cells can be from a different species and/or cell type than the
chromosomal nucleic acid within the ACes. Also provided is a method of
making one or more cell lines, comprising:

a) integrating into endogenous chromosomal DNA of a selected cell
species, a multiplicity of heterologous recombination sites,

b) introducing a multiplicity of ACes under conditions that promote
the site-specific chromosomal arm exchange of the ACes into, and out of,
a multiplicity of the heterologous recombination sites integrated within the
cell's endogenous chromosomal DNA;

c) isolating said multiplicity of ACes, thereby producing a library of
ACes whereby a multiplicity of ACes have mutually exclusive portions of
the endogenous chromosomal DNA therein;

d) introducing the isolated multiplicity of ACes of step c) into a
multiplicity of cells, thereby creating a library of cells;

e) selecting different cells having mutually exclusive ACes therein
and clonally expanding or differentiating said different cells into clonal cell
cultures, thereby creating one or more cell lines.
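The requirement in steps c) and e) that library members carry mutually exclusive portions of the chromosomal DNA can be expressed as a simple interval check. The sketch below assumes, for illustration only, that each captured portion is described by hypothetical half-open (start, end) coordinates on a single reference sequence.

```python
def mutually_exclusive(regions):
    """True if no two half-open (start, end) regions overlap."""
    ordered = sorted(regions)
    # After sorting by start, any overlap shows up between neighbours.
    return all(prev[1] <= cur[0] for prev, cur in zip(ordered, ordered[1:]))


def select_exclusive(regions):
    """Greedily keep regions that do not overlap an already kept region."""
    kept = []
    for start, end in sorted(regions, key=lambda r: r[1]):
        if not kept or kept[-1][1] <= start:
            kept.append((start, end))
    return kept
```

A call such as `select_exclusive([(0, 10), (5, 20), (20, 30)])` keeps a non-overlapping subset, mirroring the selection of different cells in step e); real selection would of course rest on experimental characterization rather than known coordinates.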

These ACes, cell lines and methods utilize the site-specific
recombination sites on platform ACes in a manner analogous to YAC
manipulation related to: the methods of generating terminal deletions in
normal and artificial chromosomes (e.g., ACes; as described in Vollrath
et al., 1988, PNAS, USA, 85:6027-6031; and Pavan et al., PNAS, USA,
87:1300-1304); the methods of generating interstitial deletions in normal
and artificial chromosomes (as described in Campbell et al., 1991, PNAS,
USA, 88:5744-5748); and the methods of detecting homologous
recombination between two ACes (as described in Cellini et al., 1991,
Nuc. Acid Res., 19(5):997-1000).

5. Use of platform ACes in Pharmacogenomic/toxicology
applications (development of "Reporter ACes")

In addition to the placement of genes onto ACes chromosomes for
therapeutic protein production or gene therapy, the platform can be
engineered via the IntR lambda integrase to carry reporter-linked constructs
(reporter genes) that monitor changes in cellular physiology as measured by
the particular reporter gene (or a series of different reporter genes) readout.
The reporter-linked constructs are designed to include a gene that can be
detected (by, for example, fluorescence, drug resistance,
immunohistochemistry, or transcript production, and the like) with
well-known regulatory sequences that would control the expression of the
detectable gene. Exemplary regulatory promoter sequences are well-known
in the art.

A) Reporter ACes for drug pathway screening

The ACes can be engineered to carry reporter-linked constructs that
indicate a signal is being transduced through one or a number of pathways.
For example, transcriptionally regulated promoters from genes at the end (or
any other chosen point) of particular signal transduction pathways could be
engineered on the ACes to express the appropriate readout (either by
fluorescent protein production or drug resistance) when the pathway is
activated (or down-regulated as well). In one embodiment, a number of
reporters from different pathways can be placed on an
ACes chromosome. Cells (and/or whole animals) containing such a
Reporter ACes could be exposed to a variety of drugs or compounds and
monitored for the effects of the drugs or compounds upon the selected
pathway(s) by the reporter gene(s). Thus, drugs or compounds can be
classified or identified by particular pathways they excite or
down-regulate. Similarly, transcriptional profiles obtained from genomic
array experiments can be biologically validated using the reporter ACes
provided herein.
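A minimal sketch of how compounds might be classified by the pathways they excite or down-regulate, assuming each reporter yields a numeric readout that can be compared with an untreated baseline. The two-fold threshold, the pathway names and the numbers are hypothetical and chosen only for illustration.

```python
def classify_compound(readouts, baseline, fold=2.0):
    """Label each pathway reporter relative to its untreated baseline.

    Assumption: a change of at least `fold` in either direction counts
    as a response; smaller changes are treated as 'unchanged'.
    """
    calls = {}
    for pathway, signal in readouts.items():
        ratio = signal / baseline[pathway]
        if ratio >= fold:
            calls[pathway] = "activated"
        elif ratio <= 1.0 / fold:
            calls[pathway] = "down-regulated"
        else:
            calls[pathway] = "unchanged"
    return calls


# Hypothetical reporter readouts (arbitrary fluorescence units).
baseline = {"pathway_A": 100.0, "pathway_B": 100.0, "pathway_C": 100.0}
treated = {"pathway_A": 500.0, "pathway_B": 40.0, "pathway_C": 110.0}
print(classify_compound(treated, baseline))
```

Compounds with the same pattern of calls across the reporters would fall into the same class under this simple scheme.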

B) Reporter ACes for toxic compound testing

Environmental or man-made genotoxicants can be tested in cell
lines carrying platform ACes bearing a number of reporter genes linked to
promoters that are transcriptionally regulated in response to DNA damage,
induced apoptosis or necrosis, and cell-cycle perturbations. Furthermore,
new drugs and/or compounds could be tested in a similar manner with the
genotoxicant ACes reporter for their cellular/genetic toxicity by such a
screen. Likewise, toxic compound testing could be carried out in whole
transgenic animals carrying the ACes chromosome that measures
genotoxicant exposure ("canary in a coal mine"). Thus, the same or
similar type ACes could be used for toxicity testing in either a cell-based
or whole animal setting. An example would include ACes that carry
reporter-linked genes controlled by various cytochrome P450 profiled
promoters and the like.

C) Reporter ACes for individualized pharmacogenomics/drug profiling

A common disease may arise via various mechanisms. In many
instances there are multiple treatments available for a given disease.
However, the success of a given treatment may depend upon the
mechanism by which the disease originated and/or by the genetic
background of the patient. In order to establish the most effective
treatment for a given patient one could utilize the ACes reporters provided
herein. ACes reporters can be used in patient cell samples to determine
an individualized drug regimen for the patient. In addition, potential
polymorphisms affecting the transcriptional regulation of an individual's
particular gene can be assessed by this approach.

D) Reporter ACes for classification of similar patient tumors

As with other diseases as described in 5.C) above, cancer cells
arise via different mechanisms. Furthermore, as a cancerous cell
propagates it may undergo genomic alterations. An ACes reporter
transferred to cells of different patients having the same disease, i.e.,
similar cancers, could be used to categorize the particular cancer of each
patient, thereby facilitating the identification of the most effective
therapeutic regimen. Examples would include the validation of array
profiling of certain classes of breast cancers. Subsequently, appropriate
drug profiling could be carried out as described above.

E) Reporter ACes as a "differentiation" sensor

The ACes reporter can be used as a "differentiation" sensor in stem
cells or other progenitor cells in order to enrich by selection (either
FACS-based screening, drug selection and/or use of a suicide gene) for a
particular class of differentiated or undifferentiated cells. For example,
in one embodiment, this assay could also be used for compound screening
for small molecule modifiers of cell differentiation.

F) Whole animal studies with Reporter ACes

Finally, with whole-body fluorescence imaging technology (Yang et
al. (2000) PNAS 97:12278) any of the above Reporter ACes methods
could be used in conjunction with whole-body imaging to monitor reporter
genes within whole animals without sacrificing the animal. This would
allow temporal and spatial analysis of expression patterns under a given
set of conditions. The conditions tested may include, for example, normal
differentiation of a stem cell, response to drug or compound treatment
whether targeted to the diseased tissue or presented systemically,
response to genotoxicants, and the like.

The following examples are included for illustrative purposes only
and are not intended to limit the scope of the invention.

EXAMPLE 1

pFK161

Cosmid pFK161 (SEQ ID NO: 118) was obtained from Dr. Gyula
Hadlaczky and contains a 9 kb NotI insert derived from a murine rDNA
repeat (see clone 161 described in PCT Application Publication No.
WO97/40183 by Hadlaczky et al. for a description of this cosmid). This
cosmid, referred to as clone 161, contains sequence corresponding to
nucleotides 10,232-15,000 in SEQ ID NO. 26. It was produced by
inserting fragments of the megachromosome (see U.S. Patent No.
6,077,697 and International PCT application No. WO 97/40183; for
example, H1D3, which was deposited at the European Collection of
Animal Cell Culture (ECACC) under Accession No. 96040929, is a
mouse-hamster hybrid cell line carrying this megachromosome) into
plasmid pWE15 (Stratagene, La Jolla, California; SEQ ID No. 31) as
follows. Half of a 100 µl low melting point agarose block (mega-plug)
containing isolated SATACs was digested with NotI overnight at 37°C.
Plasmid pWE15 was similarly digested with NotI overnight. The
mega-plug was then melted and mixed with the digested plasmid, ligation
buffer and T4 DNA ligase. Ligation was conducted at 16°C overnight.
Bacterial DH5α cells were transformed with the ligation product and
transformed cells were plated onto LB/Amp plates. Fifteen to twenty
colonies were grown on each plate for a total of 189 colonies. Plasmid DNA
was isolated from colonies that survived growth on LB/Amp medium and
analyzed by Southern blot hybridization for the presence of DNA that
hybridized to a pUC19 probe. This screening methodology assured that
all clones, even clones lacking an insert but yet containing the pWE15
plasmid, would be detected.

Liquid cultures of all 189 transformants were used to generate
cosmid minipreps for analysis of restriction sites within the insert DNA.
Six of the original 189 cosmid clones contained an insert. These clones
were designated as follows: 28 (~9-kb insert), 30 (~9-kb insert), 60
(~4-kb insert), 113 (~9-kb insert), 157 (~9-kb insert) and 161 (~9-kb
insert). Restriction enzyme analysis indicated that three of the clones
(113, 157 and 161) contained the same insert.
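The screening figures given above can be cross-checked with simple arithmetic. The sketch below only restates numbers from the text (189 transformants, six with an insert, fifteen to twenty colonies per plate); the implied number of plates is an inference from those figures and is not stated in the source.

```python
import math

total_colonies = 189   # transformants screened
with_insert = 6        # clones 28, 30, 60, 113, 157 and 161
per_plate = (15, 20)   # "fifteen to twenty colonies" on each plate

insert_fraction = with_insert / total_colonies
# With 15-20 colonies per plate, n plates must satisfy 15*n <= 189 <= 20*n.
min_plates = math.ceil(total_colonies / per_plate[1])
max_plates = math.floor(total_colonies / per_plate[0])

print(f"clones with insert: {insert_fraction:.1%}")
print(f"implied number of plates: {min_plates} to {max_plates}")
```

On these figures roughly 3.2% of the transformants carried an insert, and the colony counts are consistent with about 10 to 12 plates.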

For sequence analysis the insert of cosmid clone no. 161 was
subcloned as follows. To obtain the end fragments of the insert of clone
no. 161, the clone was digested with NotI and BamHI and ligated with
NotI/BamHI-digested pBluescript KS (Stratagene, La Jolla, California).
Two fragments of the insert of clone no. 161 were obtained: a 0.2-kb and
a 0.7-kb insert fragment. To subclone the internal fragment of the insert
of clone no. 161, the same digest was ligated with BamHI-digested
pUC19. Three fragments of the insert of clone no. 161 were obtained: a
0.6-kb, a 1.8-kb and a 4.8-kb insert fragment.

The insert corresponds to an internal section of the mouse
ribosomal RNA gene (rDNA) repeat unit between positions 7551-15670
as set forth in GENBANK accession no. X82564, which is provided as
SEQ ID NO. 18. The sequence data obtained for the insert of clone no.
161 is set forth in SEQ ID NOS. 19-25. Specifically, the individual
subclones corresponded to the following positions in GENBANK accession
no. X82564 (SEQ ID NO: 18) and in SEQ ID NOs. 19-25:

Subclone   Start    End      Site          SEQ ID No.
           (in X82564)
161k1      7579     7755     NotI, BamHI   19
161m5      7756     8494     BamHI         20
161m7      8495     10231    BamHI         21 (shows only sequence corresponding
                                           to nt. 8495-8950),
                                           22 (shows only sequence corresponding
                                           to nt. 9851-10231)
161m12     10232    15000    BamHI         23 (shows only sequence corresponding
                                           to nt. 10232-10600),
                                           24 (shows only sequence corresponding
                                           to nt. 14267-15000)
161k2      15001    15676    NotI, BamHI   25
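The subclone coordinates tabulated above can be checked for internal consistency. The following sketch uses only the start and end positions from the table (X82564 numbering, treated as inclusive) and computes each subclone's length and whether the subclones tile the insert without gaps; the function names are illustrative.

```python
# Subclone coordinates (GENBANK X82564 numbering) taken from the table.
SUBCLONES = [
    ("161k1", 7579, 7755),
    ("161m5", 7756, 8494),
    ("161m7", 8495, 10231),
    ("161m12", 10232, 15000),
    ("161k2", 15001, 15676),
]


def lengths(subclones):
    """Length in bp of each subclone (coordinates are inclusive)."""
    return {name: end - start + 1 for name, start, end in subclones}


def is_contiguous(subclones):
    """True if each subclone starts one base after the previous one ends."""
    return all(b[1] == a[2] + 1 for a, b in zip(subclones, subclones[1:]))


for name, bp in lengths(SUBCLONES).items():
    print(f"{name}: {bp} bp (~{bp / 1000:.1f} kb)")
print("contiguous:", is_contiguous(SUBCLONES))
print("total:", sum(lengths(SUBCLONES).values()), "bp")
```

The five intervals are contiguous and together span positions 7579-15676, a total of 8098 bp.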
The sequence set forth in SEQ ID NOs. 19-25 diverges in some
positions from the sequence presented in positions 7551-15670 of
GENBANK accession no. X82564. Such divergence may be attributable
to random mutations between repeat units of rDNA.

For use herein, the rDNA insert from the clone was prepared by
digesting the cosmid with NotI and BglII and was purified as described
above. Growth and maintenance of bacterial stocks and purification of
plasmids were performed using standard well known methods (see, e.g.,
Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd
Edition, Cold Spring Harbor Laboratory Press), and plasmids were purified
from bacterial cultures using Midi- and Maxi-prep kits (Qiagen,
Mississauga, Ontario).
pDsRed1-N1

This vector is available from Clontech (see SEQ ID No. 29) and
encodes the red fluorescent protein (DsRed; Genbank accession no.
AF272711; SEQ ID Nos. 39 and 40). DsRed, which has a vivid red
fluorescence, was isolated from the IndoPacific sea anemone relative
Discosoma species. The plasmid pDsRed1-N1 (Clontech; SEQ ID No. 29)
constitutively expresses a human codon-optimized variant of the
fluorescent protein under control of the CMV promoter. Unmodified, this
vector expresses high levels of DsRed1 and includes sites for creating
N-terminal fusions by cloning proteins of interest into the multiple cloning
site (MCS). It confers kanamycin (Kan) and neomycin (Neo) resistance for
selection in bacterial or eukaryotic cells.
Plasmid pMG

Plasmid pMG (InvivoGen, San Diego, California; see SEQ. ID. NO.
27 for the nucleotide sequence of pMG) contains the hygromycin
phosphotransferase gene under the control of the immediate-early human
cytomegalovirus (hCMV) enhancer/promoter with intron A. Vector pMG
also contains two transcriptional units allowing for the coexpression of
two heterologous genes from a single vector sequence.

The first transcriptional unit of pMG contains a multiple cloning site
for insertion of a gene of interest, the hygromycin phosphotransferase
gene (hph) and the immediate-early human cytomegalovirus (hCMV)
enhancer/promoter with intron A (see, e.g., Chapman et al. (1991) Nuc.
Acids Res. 19:3979-3986) located upstream of hph and the multiple
cloning site, which drives the expression of hph and any gene of interest
inserted into the multiple cloning site as a polycistronic mRNA. The first
transcriptional unit also contains a modified EMCV internal ribosomal
entry site (IRES) upstream of the hph gene but downstream of the hCMV
promoter and MCS for ribosomal entry in translation of the hph gene (see
SEQ ID NO. 27, nucleotides 2736-3308). The IRES is modified by
insertion of the constitutive E. coli promoter (EM7) within an intron (IM7)
into the end of the IRES. In mammalian cells, the E. coli promoter is
treated as an intron and is spliced out of the transcript. A
polyadenylation signal from the bovine growth hormone (bGh) gene (see,
e.g., Goodwin and Rottman (1992) J. Biol. Chem. 267:16330-16334)
and a pause site derived from the 3' flanking region of the human α2
globin gene (see, e.g., Enriquez-Harris et al. (1991) EMBO J. 10:1833-1842)
are located at the end of the first transcription unit. Efficient
polyadenylation is facilitated by inserting the flanking sequence of the
bGh gene 3' to the standard AAUAAA hexanucleotide sequence.
The second transcriptional unit of pMG contains another multiple
cloning site for insertion of a gene of interest and an EF-1α/HTLV hybrid
promoter located upstream of this multiple cloning site, which drives the
expression of any gene of interest inserted into the multiple cloning site.
The hybrid promoter is a modified human elongation factor-1 alpha (EF-1
alpha) gene promoter (see, e.g., Kim et al. (1990) Gene 91:217-223)
that includes the R segment and part of the U5 sequence (R-U5') of the
human T-cell leukemia virus (HTLV) type I long terminal repeat (see, e.g.,
Takebe et al. (1988) Mol. Cell. Biol. 8:466-472). The Simian Virus 40
(SV40) late polyadenylation signal (see Carswell and Alwine (1989) Mol.
Cell. Biol. 9:4248-4258) is located downstream of the multiple cloning
site. Vector pMG contains a synthetic polyadenylation site for the first
and second transcriptional units at the end of the transcriptional unit
based on the rabbit β-globin gene and containing the AATAAA
hexanucleotide sequence and a GT/T-rich sequence with 22-23
nucleotides between them (see, e.g., Levitt et al. (1989) Genes Dev.
3:1019-1025). A pause site derived from the C2 complement gene (see
Moreira et al. (1995) EMBO J. 14:3809-3819) is also located at the 3'
end of the second transcriptional unit.

Vector pMG also contains an ori sequence (ori pMB1) located
between the SV40 polyadenylation signal and the synthetic
polyadenylation site.

EXAMPLE 2

A. Construction of targeting vector and transfection into LMtk- cells
for the generation of platform chromosomes

A targeting vector derived from the vector pWE15 (GenBank
Accession # X65279) was modified by replacing the SalI (Klenow
filled)/SmaI neomycin resistance containing fragment with the
PvuII/BamHI (Klenow filled) puromycin resistance containing fragment
(isolated from plasmid pPUR, Clontech Laboratories, Inc., Palo Alto, CA;
SEQ ID No. 30) resulting in plasmid pWEPuro. Subsequently a 9 Kb NotI
fragment from the plasmid pFK161 (SEQ ID NO: 118) containing a portion
of the mouse rDNA region was cloned into the NotI site of pWEPuro
resulting in plasmid pWEPuro9K (Figure 2). The vector pWEPuro9K was
digested with SpeI to linearize and transfected into LMtk- mouse cells.
Puromycin resistant colonies were isolated and subsequently tested for
artificial chromosome formation via fluorescence in situ hybridization (FISH)
(using mouse major and minor DNA repeat sequences, the puromycin
gene and telomere sequences as probes), and fluorescence-activated cell
sorting (FACS). From this sort, a subclone was isolated containing an
artificial chromosome, designated 5B11.12, which carries 4-8 copies of
the puromycin resistance gene contained on the pWEPuro9K vector.
FISH analysis of the 5B11.12 subclone demonstrated the presence of
telomeres and mouse minor repeat DNA on the ACes. DOT PCR has been
done on the 5B11.12 ACes revealing the absence of uncharacterized
euchromatic regions on the ACes. A recombination site, such as an att or
loxP engineering site or a plurality thereof, was introduced onto this ACes
thereby providing a platform for site-specific introduction of heterologous
nucleic acid.

B. Targeting a single sequence-specific recombination site onto
platform chromosomes

After the generation of the 5B11.12 platform, a single
sequence-specific recombination site is placed onto the platform
chromosome via homologous recombination. For this, DNA sequences
containing the site-specific recombination sequence can be flanked with
DNA sequences of homology to the platform chromosome. For example,
using the platform chromosome made from the pWEPuro9K vector, mouse
rDNA sequences or mouse major satellite DNA can be used as homologous
sequences to target onto the platform chromosome. A vector is designed
to have these homologous sequences flanking the site-specific
recombination site and, after the appropriate restriction enzyme digest to
generate free ends of homology to the platform chromosome, the DNA is
transfected into cells harboring the platform chromosome (Figure 3).
Examples of site-specific cassettes that are targeted to the platform
chromosome using either mouse rDNA or mouse major repeat DNA include
the SV40-attP-hygro cassette and a red fluorescent protein (RFP) gene
flanked by loxP sites (Cre/lox, see, e.g., U.S. Patent No. 4,959,317 and
description herein). After transfection and integration of the site-specific
cassette, homologous recombination events onto the platform chromosome
are subcloned and identified by FACS (e.g., screen and single cell subclone
via expression of resistance or fluorescent marker) and PCR analysis.

For example, a vector can be constructed containing regions of the
mouse rDNA locus flanking a gene cassette containing the SV40 early
reporter-bacteriophage lambda attP site-hygromycin selectable marker
(see Figure 4 and described below). The use of the bacteriophage lambda
attP site for lambda integrase-mediated site-specific recombination is
described below. A homologous recombination event of the
SV40-attP-hygro cassette onto the platform chromosome was identified
using PCR primers that detect the homologous recombination and further
confirmed by FISH analysis. After identifying subcloned colonies
containing the platform chromosome with a single site-specific
recombination site, cells carrying the platform chromosome with a single
site-specific recombination site can now be engineered with site-specific
recombinases (e.g., lambda INT, Cre) for integrating a target gene
expression vector.

C. Targeting a red fluorescent protein (RFP) gene flanked by loxP sites
onto 5B11.12 platform

As another example, while loxP recombination sites could have
been introduced onto the ACes during de novo biosynthesis, it was
thought that this might result in multiple segments of the ACes containing
a high number of loxP sites, potentially leading to instability upon
Cre-mediated recombination. A gene targeting approach was therefore
devised to introduce a more limited number of loxP recombination sites
into a locus of the 5B11-12 ACes containing introduced and possibly
co-amplified endogenous rDNA sequences. Although there are more than
200 copies of rDNA genes in the haploid mouse genome distributed
amongst 5-11 chromosomes (depending on strain), rDNA sequences were
chosen as the target on the ACes since they represent a less frequent
target than that of the satellite repeat sequences. Moreover, having
observed much stronger pWEPuro9K hybridization to the 5B11-12 ACes
than to other LMTK- chromosomes and in light of the observation that the
transcribed spacer sequences within the rDNA may be less conserved
than the rRNA coding regions, it was contemplated that a targeting vector
based on the rDNA gene segment in pWEPuro9K would have a higher
probability of targeting to the ACes rather than to other LMTK-
chromosomes. Accordingly, a targeting vector, pBSFKLoxDsRedLox, was
designed and constructed based on the rDNA sequences contained in
pWEPuro9K.

The plasmid pBSFKLoxDsRedLox was generated in 4 steps. First,
the NotI rDNA insert of pWEPuro9K (Figure 2) was inserted into pBS SK-
(Stratagene) giving rise to pBSFK. Second, a loxP polylinker cassette was
generated by PCR amplification of pNEB193 (SEQ ID NO:32; New
England Biolabs) using primers complementary to the M13 forward and
reverse priming sites at their 3' end and a 34 bp 5' extension comprising a
LoxP site. This cassette was reinserted into pNEB193 generating
p193LoxMCSLox. Third, the DsRed gene from pDsRed1-N1 (SEQ ID
NO:29; Clontech) was then cloned into the polylinker between the loxP
sites generating p193LoxDsRedLox. Fourth, a fragment consisting of the
DsRed gene flanked by loxP sites was cloned into a unique NdeI site within
the rDNA insert of pBSFK generating pBSFKLoxDsRedLox.

A gel purified 11 Kb PmlI/EcoRV fragment of pBSFKLoxDsRedLox
was used for transfection. To detect targeted integration, PCR primers
were designed from rDNA sequences within the 5' NotI-PmlI fragment of
pWEPuro9K that is not present on the targeting fragment (5' primer) and
sequence within the LoxDsRedLox cassette (3' primer). If the targeting
DNA integrated correctly within the rDNA sequences, PCR amplification
using these primers would give rise to a 2.3 Kb band. PCR reactions
containing 1-4 µl of genomic DNA were carried out according to the
MasterTaq protocol (Eppendorf), using murine rDNA 5' primer
(5'-CGGACAATGCGGTTGTGCGT-3'; SEQ ID NO:72) and DsRed 3' primer
(5'-GGCCCCGTAATGCAGAAGAA-3'; SEQ ID NO:73) and PCR products
were analyzed by agarose gel electrophoresis.
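As a quick sanity check on the two screening primers given above, their length, GC content and an approximate melting temperature can be computed. The sketch uses the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C), which is only a rough estimate chosen here for illustration; the source does not state how the primers were designed or what annealing temperature was used.

```python
PRIMERS = {
    "murine rDNA 5' primer (SEQ ID NO:72)": "CGGACAATGCGGTTGTGCGT",
    "DsRed 3' primer (SEQ ID NO:73)": "GGCCCCGTAATGCAGAAGAA",
}


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"Tm ~{wallace_tm(seq)} C (Wallace rule)")
```

Both primers are 20 nucleotides long with similar estimated melting temperatures, which is what one would generally want for a primer pair used in the same reaction.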

1.5 x 10^6 5B11-12 LMTK cells were transfected with 2 µg of the
pBSFKLoxDsRedLox targeting DNA described above using Lipofectamine
Plus (Invitrogen). For flow sorting, harvested cells were suspended in
medium and applied to the Becton Dickinson Vantage SE cell sorter,
equipped with 488 nm lasers for excitation and 585/42 bandpass filter for
optimum detection of RFP fluorescence. Cells were sorted using dPBS as
sheath buffer. Negative control parental 5B11-12 cells and a positive
control LMTK cell line stably transfected with DsRed were used to
establish the selection gates. The RFP positive gated populations were
recovered, diluted in medium supplemented with 1X
penicillin-streptomycin (Invitrogen), then plated and cultured as previously
described. After 4 rounds of enrichment, the percentage of RFP positive
cells reached levels of 50% or higher. DNA from populations was
analyzed by PCR for evidence of targeted integration. Ultimately, single
cell subclones were established from positive pools and were analyzed by
PCR and PCR-positive clones confirmed by FISH as described below.
DNA was purified from pools or single cell clones using previously
described methods set forth in Lahm et al., Transgenic Res., 1998;
7:131-134, or in some cases using a Wizard Genomic DNA purification kit
(Promega). For FISH analysis, a biotinylated DsRed gene probe was
generated by PCR using DsRed specific primers and biotin-labeled dUTP
(5' RFP primer: 5'-GGTTTAAAGTGCGCTCCTCCAAGAACGTCATC-3',
SEQ ID NO:74; and 3' RFP primer:
5'-AGATCTAGAGCCGCCGCTACAGGAACAGGTGGTGGCGGCC-3'; SEQ
ID NO:75). To maximize the signal intensity of the DsRed probe,
Tyramide amplification was carried out according to the manufacturer's
protocols (NEN).

The process of testing the feasibility of a more general targeting
strategy that would not rely on enrichment via drug selection of stably
transfected clones can be summarized as follows. A red fluorescent
protein gene (RFP; encoded by the DsRed gene) was inserted between the
loxP sites of the targeting vector to form pBSFKLoxDsRedLox. After
transfection with pBSFKLoxDsRedLox, sequential rounds of high speed
flow sorting and expansion of sorted cells in culture could then be used to
enrich for stable transformants expressing RFP. In the event of targeted
integration, PCR screening with primers that amplify from a spacer region
within the segment of the 45S pre-rRNA gene in pWEPuro9K to a specific
anchor sequence within the DsRed gene in the targeting cassette would
give rise to a diagnostic 2.3 Kb band. However, as rDNA clusters are
found on several chromosomes, confirmation of targeting to an ACes
would require fluorescence in situ hybridization (FISH) analysis. Finally,
the flanking of the DsRed gene by loxP sites would allow for its removal
and subsequent replacement with other genes of interest.

After transfection of the targeting sequence into 5B11-12 cells,
enrichment for targeted clones was carried out using a combination of
flow cytometry to detect red-fluorescing cells and PCR screening.
Ultimately 17 single cell subclones were identified as potential targeted
clones by PCR and of these 16 were found by FISH to contain the DsRed
integration event into the ACes. These subclones are referred to herein
as D11-C4, D11-C12, D11-H3, C9-C9, C9-B9, C9-F4, C9-H8, C9-F2,
C9-G8, C9-B6, C9-G3, C9-E12, C9-A11, C11-E3, C11-A9 and C11-H4. PCR
analysis of genomic DNA isolated from the D11-C4 subclone gave rise to
a 2.3 Kb band, indicative of a targeted integration into an rDNA locus.
Further analysis of the subclone by FISH with a DsRed gene
probe demonstrated integration of the LoxDsRedLox targeting cassette on
the ACes co-localizing with one of the regions of rDNA staining seen on
the 5B11-12 ACes, consistent with a targeted integration into an rDNA
locus of the ACes, while integrations on other chromosomes were not
observed. Since transfected cells were maintained as heterogeneous
populations through several cycles of sorting and replating, it was not
possible to estimate the frequency of targeted events. In most
mammalian cell lines the frequency of gene targeting via homologous
recombination is roughly 10^-5 to 10^-7 per treated cell. Despite the low
frequency of these events in mammalian cells, it is clear that an RFP
expression based screening paradigm, coupled with PCR analysis, can
effectively detect and enrich for such infrequent events in a large
population. In instances where drug selection is not possible or not
desirable, such a system may provide a useful alternative. It was also
verified that the modified ACes in subclone D11-C4 could be purified by
flow cytometry. The results indicate that the flow karyogram of the
D11-C4 subclone was unaltered from that of the 5B11-12 cell line. Thus, the
D11-C4 ACes can be purified in high yield from native chromosomes of
the host cell line.
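
To give a feel for why enrichment is needed, the frequency range quoted
above can be combined with the number of cells used per transfection.
This is a back-of-envelope sketch only: it assumes that every one of the
1.5 x 10^6 transfected cells counts as a "treated" cell, which the text
does not state, and the function name is hypothetical.

```python
# Back-of-envelope estimate; assumes every transfected cell counts as "treated".
CELLS_TRANSFECTED = 1.5e6          # cells per transfection, from the text
FREQ_HIGH, FREQ_LOW = 1e-5, 1e-7   # targeting frequency range, from the text


def expected_events(cells: float, frequency: float) -> float:
    """Expected number of targeted integrations = cells * per-cell frequency."""
    return cells * frequency


high = expected_events(CELLS_TRANSFECTED, FREQ_HIGH)
low = expected_events(CELLS_TRANSFECTED, FREQ_LOW)
print(f"expected targeted events per transfection: {low:.2f} to {high:.0f}")
```

Under these assumptions only a handful of targeted cells, at most, would
be present among more than a million, which is consistent with the use of
repeated rounds of sorting before PCR screening.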

D. Reduction of LoxP on ACes to a single site.

The strong hybridization signal detected by FISH on the ACes using
the DsRed gene probe suggests that several copies of the targeting
cassette may be present on the ACes in the D11-C4 line. This also
suggests that multiple rDNA genes have been correctly targeted.

Accordingly, in certain embodiments where necessary, the number
of loxP sites on the ACes can be reduced to a single site by in situ
treatment with Cre recombinase, provided that the sites are co-linear.
Such a process is described for multiple loxP-flanked integrations on a
native mouse chromosome (Garrick et al., Nature Genet., 1998,
Jan;18(1):56-59). Reduction to a single loxP site on the D11-C4 ACes
would result in the loss of the DsRed gene, forming the basis of a useful
screen for this event.

For this purpose, a Cre expression plasmid pCX-Cre/GFP III has
been generated by first deleting the EcoRI fragment of pCX-eGFP (SEQ ID
NO:71) containing the eGFP coding sequence and replacing it with
a PCR amplified Cre recombinase coding sequence (SEQ ID NO:58),
generating pCX-Cre. Next, the AseI/SspI fragment of pD2eGFP-N1
(containing the CMV promoter driving the D2EGFP gene with SV40 polyA
signal; Clontech; SEQ ID NO:87) was inserted into the filled HindIII site of
pCX-Cre, generating pCX-Cre/GFP III. Control plasmid pCX-CreRev/GFP
III was generated in similar fashion except that the Cre recombinase
coding sequence was inserted in the antisense orientation. LMTK- cell
line D11-C4 (containing first generation platform ACes with multiple
loxP-DsRED sites) and 5B11-12 cell line (containing ACes with no loxP-DsRED
sites) are maintained in culture as described above. D11-C4 cells are
transfected with 2 µg of plasmid pCX-Cre/GFP III or 2 µg of
pCX-CreRev/GFP III using Lipofectamine (Invitrogen) as previously described.

Forty-eight to seventy-two hours after transfection, transfected
D11-C4 cells are harvested and GFP positive cells are sorted by cell
cytometry using a FACS Vantage cell sorter (Becton Dickinson) as
follows: All D11-C4 cells transfected with pCX-Cre/GFP III or control
plasmid pCX-CreRev/GFP III that exhibit GFP fluorescence higher than the
gate level established by untransfected cells are collected and placed in
culture a further 7-14 days. After 7-14 days the initial D11-C4 cells are
harvested and analyzed by cell cytometry as follows: Untransfected
D11-C4 cells are used to establish the gate that defines the RFP positive
population, while 5B11-12 cells are used to set the RFP negative gate.
The GFP positive population of D11-C4 transfected with pCX-Cre/GFP III
should show decreased red fluorescence compared to pCX-CreRev/GFP III
transfected or untransfected control D11-C4 cells. The cells exhibiting
greatly decreased or no RFP expression are collected and single cell
clones subsequently established. These clones will be expanded and
analyzed by fluorescence in situ hybridization and Southern blotting to
confirm the removal of loxP-DsRed gene copies.

EXAMPLE 3 

Construction of targeting vector and transfection into LMtk- cells for the
generation of platform chromosomes containing multiple site-specific
recombination sites

An example of a selectable marker system for the creation of a
chromosome-based platform is shown in Figure 4. This system includes a
vector containing the SV40 early promoter immediately followed by (1) a
282 base pair (bp) sequence containing the bacteriophage lambda attP
site and (2) the puromycin resistance marker. Initially a PvuII/StuI
fragment containing the SV40 early promoter from plasmid pPUR
(Clontech Laboratories, Inc., Palo Alto, CA; SEQ ID No. 30) was
subcloned into the EcoRI/CRI site of pNEB193 (a pUC19 derivative
obtained from New England Biolabs, Beverly, MA; SEQ ID No. 32)
generating the plasmid pSV40193. The only differences between pUC19
and pNEB193 are in the polylinker region. A unique AscI site
(GGCGCGCC) is located between the BamHI site and the SmaI site, a
unique PacI site (TTAATTAA) is located between the BamHI site and the
XbaI site and a unique PmeI site (GTTTAAAC) is located between the PstI
site and the SalI site.

The attP site was PCR amplified from lambda genome (GenBank
Accession # NC_001416) using the following primers:

attPUP: CCTTGCGCTAATGCTCTGTTACAGG SEQ ID No. 1
attPDWN: CAGAGGCAGGGAGTGGGACAAAATTG SEQ ID No. 2

After amplification and purification of the resulting fragment, the
attP site was cloned into the SmaI site of pSV40193 and the orientation
of the attP site was determined by DNA sequence analysis (plasmid
pSV40193attP). The gene encoding puromycin resistance (Puro) was
isolated by digesting the plasmid pPUR (Clontech Laboratories, Inc., Palo
Alto, CA) with AgeI/BamHI followed by filling in the overhangs with
Klenow and subsequently cloned into the AscI site downstream of the
attP site of pSV40193attP generating the plasmid
pSV40193attPsensePUR (Figure 4; SEQ ID NO:113).

The plasmid pSV40193attPsensePUR was digested with ScaI and
co-transfected with the plasmid pFK161 (SEQ ID NO: 118) into mouse
LMtk- cells and platform artificial chromosomes were identified and
isolated as described above. The process for generating this exemplary
platform ACes containing multiple site-specific recombination sites is
summarized in Figure 5. One platform ACes resulting from this
experiment is designated B19-18. This platform ACes chromosome may
subsequently be engineered to contain target gene expression nucleic
acids using the lambda integrase mediated site-specific recombination
system as described herein in Examples 7 and 8.

EXAMPLE 4 

Lambda Integrase mediated site-specific recombination of a RFP 
expressing vector onto artificial chromosomes 

In this example, a vector expressing the red fluorescent protein
(RFP) was produced and recombined into the attP site residing on an
artificial chromosome within LMTK- cells. This recombination is depicted
in Figure 7.

A. Construction of expression vectors containing wildtype and 
mutant lambda integrase 

Mutations at the glutamic acid at position 174 in the lambda
integrase protein relax the requirement for the accessory protein IHF
during recombination and DNA supercoiling in vitro (see, Miller et al.
(1980) Cell 20:721-729; Lange-Gustafson et al. (1984) J. Biol. Chem.
259:12724-12732). Mutations at this site promote attP, attB
intramolecular recombination in mammalian cells (Lorbach et al. (2000) J.
Mol. Biol. 296:1175-1181).

To construct nucleic acid encoding the mutant, lambda integrase
was PCR amplified from bacteriophage lambda DNA (cI857 ind Sam 7;
New England Biolabs) using the following primers:
Lamint1 (SEQ ID No. 3)
(TTCGAATTCATGGGAAGAAGGCGAAGTCATGAGCG)
Lamint2 (SEQ ID No. 4)
(TTCGAATTCTTATTTGATTTCAATTTTGTCCCAC).
The resulting PCR product was digested with EcoRI and cloned into the
EcoRI site of pUC19. Lambda integrase was mutated at amino acid
position 174 using the QuikChange Site-Directed Mutagenesis Kit
(Stratagene) and the following oligos (generating a glutamic acid to
arginine change at position 174):
LambdaINTE174R (SEQ ID No. 6)
(CGCGCAGCAAAATCTAGAGTAAGGAGATCAAGACTTACGGCTGACG),
LamintR174rev (SEQ ID No. 7)
(CGTCAGCCGTAAGTCTTGATCTCCTTACTCTAGATTTTGCTGCGCG).
The resulting site directed mutant was confirmed by sequence analysis.
The wildtype and mutant lambda integrase genes were cloned into the
EcoRI site of pCX creating pCX-LamInt (SEQ ID NO: 127) and pCXLamIntR
(Figure 8; SEQ ID NO: 112).
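
The cloning strategy above relies on both amplification primers carrying
an EcoRI recognition sequence (GAATTC) near their 5' ends. A minimal
sketch of how that can be checked on the quoted primer sequences follows;
the helper name `site_positions` is hypothetical and the check is
illustrative, not a step described in the text.

```python
# Illustrative check: locate the EcoRI recognition sequence in each primer.
ECORI_SITE = "GAATTC"

PRIMERS = {
    "Lamint1": "TTCGAATTCATGGGAAGAAGGCGAAGTCATGAGCG",  # SEQ ID No. 3
    "Lamint2": "TTCGAATTCTTATTTGATTTCAATTTTGTCCCAC",   # SEQ ID No. 4
}


def site_positions(seq: str, site: str) -> list[int]:
    """Return the 0-based start index of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


for name, seq in PRIMERS.items():
    hits = site_positions(seq, ECORI_SITE)
    # The three bases that follow the first site in each primer.
    after_site = seq[hits[0] + len(ECORI_SITE):hits[0] + len(ECORI_SITE) + 3]
    print(name, "EcoRI site at", hits, "followed by", after_site)
```

In both primers the site sits after a three-base 5' extension. In Lamint1
it is followed by ATG, and in Lamint2 by TTA, which is the reverse
complement of a TAA stop codon.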

The plasmid pCX (SEQ ID No. 70) was derived from plasmid
pCXeGFP (SEQ ID No. 71). Excision of the EcoRI fragment containing the
eGFP marker generated pCX. To generate plasmid pCXLamINTR (SEQ ID
NO: 112) an EcoRI fragment containing the lambda integrase E174R (SEQ
ID No. 37) mutation was cloned into the EcoRI site of pCX, and to
generate plasmid pCX-LamINT, an EcoRI fragment containing the
wild-type lambda integrase was cloned into the EcoRI site of pCX.

B. Construction of integration vector containing attB and DsRed

The plasmid pDsRedN1 (Clontech Laboratories, Palo Alto, CA; SEQ
ID No. 29) was digested with HpaI and ligated to the following annealed
oligos:

attB1 (SEQ ID No. 8)
(TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA)
attB2 (SEQ ID No. 9)
(TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA)

The resulting vector (pDsRedN1-attB) was confirmed by PCR and
sequence analysis.
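
For two oligos to anneal into a blunt double-stranded insert, each must be
the reverse complement of the other. The short sketch below checks this
for the attB1 and attB2 sequences as quoted; the function name is
hypothetical and the check is an illustration rather than a step
described in the text.

```python
# Illustrative check that the two attB oligos are exact reverse complements.
ATTB1 = "TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA"  # SEQ ID No. 8
ATTB2 = "TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA"  # SEQ ID No. 9

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if reverse_complement(ATTB1) == ATTB2:
    print("attB1 and attB2 anneal end to end with no overhang")
else:
    print("attB1 and attB2 are not perfect reverse complements")
```

A blunt-ended duplex of this kind is compatible with ligation into the
blunt ends left by HpaI digestion.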

C. Transfection into LMtk- cells

LM(tk-) cells containing the Prototype A ACes (L1-18; Chromos
Molecular Systems Inc., Burnaby, BC Canada) were co-transfected with
pDsRedN1 or pDsRedN1-attB and either pCXLamInt (SEQ ID NO: 127) or
pCXLamIntR (SEQ ID NO: 112) using Lipofectamine Plus Reagent
(LifeTechnologies, Gaithersburg, MD). The transfected cells were grown
in DMEM (LifeTechnologies, Gaithersburg, MD) with 10% FBS (CanSera)
and G418 (CalBiochem) at a concentration of 1 mg/ml.

D. Enrichment by cell sorting

The transfected cells were sorted using a FACS Vantage SE cell
sorter (Becton Dickinson) to enrich for cells expressing DsRed. The cells
were excited with a 488 nm Argon laser at 200 watts and cells
fluorescing in the 585/42 detection channel were collected. The sorted
cells were returned to growth medium for recovery and expansion. After
three successive enrichments for cells expressing DsRed, single cell
sorting into 96 well plates was performed using the same parameters.
Duplicate plates of the single cell clones were made for PCR analysis.

E. PCR analysis of single cell clones

Pools of cells from each row and column of the 96 well plate were
used for DNA isolation. DNA was prepared using a Wizard Genomic DNA
purification kit (Promega Inc, Madison, WI). Nested PCR analysis on the
DNA pools was performed to confirm the site-specific recombination
event using the following primer sets:

attPdwn2 (SEQ ID No. 10)
(TCTTCTCGGGCATAAGTCGGACACC)

CMVen (SEQ ID No. 11)
(CTCACGGGGATTTCCAAGTCTCCAC)

followed by:

attPdwn (SEQ ID No. 12)
(CAGAGGCAGGGAGTGGGACAAAATTG)

CMVen2 (SEQ ID No. 13)
(CAACTCCGCCCCATTGACGCAAATG).

The resulting PCR reactions were analyzed by gel electrophoresis and the
potential individual clones containing the site-specific recombination event
were identified by combining the PCR results of all of the pooled rows
and columns for each 96 well plate. The individual clones were then
further analyzed by PCR using the following primers that flank the
recombination junction. L1for and F1rev flank the attR junction whereas
REDfor and L2rev flank the attL junction (see Figure 7):

L1for (SEQ ID No. 14)
AGTATCGCCGAACGATTAGCTCTTCA

F1rev (SEQ ID No. 15)
GCCGATTTCGGCCTATTGGTTAAA

REDfor (SEQ ID No. 16)
CCGCCGACATCCCCGACTACAAGAA

L2rev (SEQ ID No. 17)
TTCCTTCGAAGGGGATCCGCCTACC.
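
The row-and-column pooling described above can be sketched as a small
decoding step: a well is a candidate only if both its row pool and its
column pool gave a PCR product. This is an illustrative sketch under
assumed conventions (rows labeled A-H, columns 1-12, a hypothetical
function name); the text does not specify how wells were labeled.

```python
from itertools import product

# Assumed plate layout: 8 rows (A-H) by 12 columns (1-12).
ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)


def candidate_wells(positive_rows, positive_columns):
    """Wells lying at the intersection of PCR-positive row and column pools."""
    for row in positive_rows:
        if row not in ROWS:
            raise ValueError(f"unknown row: {row}")
    for column in positive_columns:
        if column not in COLUMNS:
            raise ValueError(f"unknown column: {column}")
    return [
        f"{row}{column}"
        for row, column in product(sorted(positive_rows), sorted(positive_columns))
    ]


# One positive row and one positive column identify a single well.
print(candidate_wells({"B"}, {3}))
# Two of each give four candidates, not all of which need be true positives.
print(candidate_wells({"B", "F"}, {3, 10}))
```

When more than one row and more than one column are positive, the
intersections are ambiguous, which is one reason for the follow-up PCR on
individual clones with junction-flanking primers.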

F. Sequence analysis of recombination junctions

PCR products spanning the recombination junction were Topo-cloned
into pcDNA3.1 D/V5His (Invitrogen Inc., San Diego, CA) and then
sequenced. The clones were confirmed to have the correct attR and attL
junctions by cycle sequencing.

G. Fluorescent In Situ Hybridization (FISH)

The cell lines containing the correct recombination junction
sequence were further analyzed by fluorescent in situ hybridization (FISH)
by probing with the DsRed coding region labeled with biotin and
visualizing with the Tyramide Signal Amplification system (TSA; NEN Life
Science Products). The results indicate that the RFP sequence is present
on the ACes.

H. Southern analysis

Genomic DNA was harvested from the cell lines containing an
ACes with the correct recombination event and digested with EcoRI. The
digested DNAs were separated on a 0.7% agarose gel, transferred and
fixed to a nylon membrane and probed with RFP coding sequences. The
result showed that there is an integrated copy of the RFP coding sequence
in each clone.

EXAMPLE 5 

Delivery of a second gene encoding GFP onto the RFP platform ACes 

A. Construction of integration vector containing attB and GFP
(pD2eGFPIresPuroattB).

The plasmid pIRESpuro2 (Clontech, Palo Alto, CA; SEQ ID NO: 88)
was digested with EcoRI and NotI then ligated to the D2eGFP EcoRI-NotI
fragment from pD2eGFP-N1 (Clontech, Palo Alto, CA) to create
pD2eGFPIresPuro2. Subsequently, oligos encoding the attB site were
annealed and ligated into the NruI site of pD2eGFPIresPuro2 to create
pD2eGFPIresPuroattB. The orientation of attB in the NruI site was
determined by PCR.

B. Transfection of LMtk- cells

The LMtk- cells containing the RFP platform ACes produced in
Example 4, which has multiple attP sites, were co-transfected with
pCXLamIntR and pD2eGFPIresPuroattB using LipofectAMINE PLUS
reagent. Five µg of each vector was placed into a tube containing 750 µl
of DMEM (Dulbecco's modified Eagle's Medium). Twenty µl of the Plus
reagent was added to the DNA and incubated at room temperature for 15
minutes. A mixture of 30 µl of LipofectAMINE and 750 µl DMEM was
added to the DNA mixture and incubated an additional 15 minutes at
room temperature. The DNA mixture was then added dropwise to
approximately 3 million cells attached to a 10 cm dish in 5 ml of DMEM.
The cells were incubated 4 hours (37°C, 5% CO2) with the DNA-lipid
mixture, after which DMEM with 20% fetal bovine serum was added to
the dishes to bring the culture medium to 10% fetal bovine serum. The
dishes were incubated at 37°C with 5% CO2.
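
The serum adjustment in the last step is a simple mixing calculation. The
sketch below solves it for an arbitrary starting volume; it assumes the
medium on the cells is serum-free before the addition and ignores the
small reagent volumes, neither of which is stated in the text, and the
function name is hypothetical.

```python
def volume_to_add(start_volume_ml: float, stock_percent: float,
                  target_percent: float, start_percent: float = 0.0) -> float:
    """Volume of serum-containing stock needed to reach target_percent.

    Solves start_percent*V0 + stock_percent*V1 = target_percent*(V0 + V1)
    for V1, the volume of stock to add.
    """
    if not start_percent < target_percent < stock_percent:
        raise ValueError("target must lie between start and stock concentrations")
    return start_volume_ml * (target_percent - start_percent) / (
        stock_percent - target_percent)


# 5 ml on the dish alone, then 5 ml plus roughly 1.5 ml of DNA-lipid mixture
# (750 ul + 750 ul), both assumed to be serum-free.
for volume in (5.0, 6.5):
    print(volume, "ml on dish ->", volume_to_add(volume, 20.0, 10.0), "ml of 20% FBS medium")
```

Because the stock is exactly twice the target concentration, the volume to
add equals the volume already on the dish.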

Plasmid pD2eGFPIresPuroattB has a puromycin gene
transcriptionally linked to the GFP gene via an IRES element. Two days
after the transfection the cells were placed in medium containing
puromycin at 4 µg/ml to select for cells containing the
pD2eGFPIresPuroattB plasmid integrated into the genome. Twenty-three
clones were isolated after 17 days of selection with puromycin. These
clones were expanded and then analyzed for the presence of the GFP
gene on the ACes by 2-color (RFP/biotin & GFP/digoxigenin) TSA-FISH
(NEN) according to the manufacturer's protocol. Sixteen of the 23 clones
produced a positive FISH signal on the ACes with a GFP probe.

EXAMPLE 6 

Delivery of ACes into Human Mesenchymal Stem Cells (hMSC)

A. Transfection

Transfection conditions for the most efficient delivery of the ACes
into hMSCs (Cambrex BioWhittaker Product Code PT-2501, lot# F0658,
East Rutherford, New Jersey) were assayed using LipofectAMINE PLUS
and Superfect. One million prototype B ACes, which are murine derived
60 Mb ACes having primarily murine pericentric heterochromatin and
carrying a "payload" containing a hygromycin B selectable marker gene
and a lacZ reporter gene (see, Telenius et al., 1999, Chrom. Res., 7:3-7;
and Kereso et al., 1996, Chrom. Res., 4:226-239; each of which is
incorporated herein by reference in its entirety), were combined with 1-12
µl of the transfection agent. In the case of LipofectAMINE PLUS, the
PLUS reagent was combined with the ACes for 15 minutes followed by
LipofectAMINE for a further 15 minutes. Superfect was complexed for
10 minutes at a ratio of 2 µl Superfect per 1 million ACes. The
ACes/transfection agent complex was then applied to 0.5 million recipient
cells and the transfection was allowed to proceed according to the
manufacturer's protocol. Percent transfected cells was determined on a
FACS Vantage flow cytometer with argon laser tuned to 488 nm at
200 mW and FITC fluorescence collected through a standard FITC 530/30
nm band pass filter. After 24 hours, IdUrd labeled ACes were delivered
to human MSCs in the range of 30-50%, varying with transfection agent
and dose. ACes delivery curves were generated from data collected in
experiments that varied the dose of the transfection reagents. Dose
response curves of Superfect and LipofectAMINE PLUS, showing delivery
of ACes into recipient hMSC cells, were prepared, measured by transfer
of IdUrd labeled ACes and detected by flow cytometry. Superfect shows
maximum delivery in the range of 30-50% at doses greater than 2 µl per
million ACes. LipofectAMINE PLUS has a 42-48% delivery peak around
5-8 µl per million ACes. These dose curves were then correlated with
toxicity data to determine the transfection conditions that will allow for
highest potential transfection efficiency. Toxicity was determined by a
modified plating efficiency assay (de Jong et al., 2001, Chrom. Research,
9:475-485). The population's normalized plating efficiency (at maximum
% delivery doses) was in the range of 0.2 - 0.4 for Superfect and 0.5 -
0.6 with LipofectAMINE PLUS.
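
One simple way to correlate delivery with toxicity is to multiply the
fraction of cells receiving ACes by the normalized plating efficiency,
giving the fraction of the starting population that is both transfected
and able to form colonies. The sketch below applies that product to the
ranges quoted above. The multiplicative model is an assumption made here
for illustration, not the analysis described in the text, and the names
are hypothetical.

```python
# Ranges quoted in the text: delivery as a fraction, plating efficiency normalized.
RANGES = {
    "Superfect": {"delivery": (0.30, 0.50), "plating": (0.2, 0.4)},
    "LipofectAMINE PLUS": {"delivery": (0.42, 0.48), "plating": (0.5, 0.6)},
}


def combined_range(delivery, plating):
    """Low and high bounds of delivery * plating efficiency (assumed model)."""
    return (delivery[0] * plating[0], delivery[1] * plating[1])


for agent, values in RANGES.items():
    low, high = combined_range(values["delivery"], values["plating"])
    print(f"{agent}: {low:.3f} to {high:.3f}")
```

Under this assumed model the lower toxicity of LipofectAMINE PLUS
outweighs the similar delivery percentages of the two agents.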

Because the transfected population consisted of mixed cell types,
flow cytometry allowed for the assessment of ACes delivery into each
sub-population and the purification of the target population. Flow profiles
showing forward scatter (cell size) and side scatter (internal cell
granularity) revealed three distinct hMSC populations that were gated into
three regions: R3 (small cell region), R4 (medium cell region), R5 (large
cell region). Transfection conditions were further optimized by
re-analyzing delivery curves and assessing the differences in delivery to each
sub-population. Dose response curves of Superfect and LipofectAMINE
were prepared showing % delivery to each sub-population represented by
the gating on the basis of cell size and granularity properties of the mixed
population. Three distinct hMSC populations were gated and % delivery
dose curves generated. Using Superfect and LipofectAMINE PLUS the
overall % delivery increased with cell size (80-90% delivery in large cells).
LipofectAMINE PLUS at high doses (8-12 µl per 1 million ACes) shows an
increase in the overall proportion of chromosome transfer to the small
population (10-20%). This suggests an advantage to using this
transfection agent if the small-undifferentiated cell population is the
desired target host cell.

B. Expression from Genes on ACes in hMSCs

Following the delivery screening process conducted in section (A)
above, the most promising results were subjected to further analyses to
monitor expression and verify the presence of structurally intact ACes.
The transfection conditions employed for these experiments were exactly
the same as those that had been used during the screening process.
Short-term expression was monitored by transfecting hMSCs with ACes
containing a RFP gene (red fluorescent protein) set forth in Example 2C as
"D11-C4". The unselected population was harvested at 72-96 hours post
transfection and % positive fluorescent cells measured by flow
cytometry. RFP expression was in the range of 1-20%.

Long-term gene expression was assayed by selecting for
hygromycin B resistant cells over a period of 7-10 days. Cytogenetic
analysis was done to detect the presence of intact ACes by Fluorescent In
Situ hybridization (FISH), where metaphase chromosomes were hybridized
to a mouse major satellite-DNA probe (targeting murine pericentric
heterochromatin) and a lambda probe (hybridizing to the lacZ gene). The
human mesenchymal transfected culture could not undergo standard
sub-cloning as diffuse colonies form with limited doublings available for
expansion. Cytogenetic analysis was performed on the entire population,
sampling over a period of 3-10 days post-transfection. The hygromycin
resistant population was then blocked in mitosis with colchicine and
analyzed for the presence of intact ACes by FISH. Preliminary FISH results
show approximately 2-8% of the hMSC-transfected population had an
intact ACes. This compared to rat skeletal muscle myoblast clones,
which were in the range of 60-95%. To increase the % of intact ACes in
the hMSC-transfected population an enrichment step can be utilized as
described in Example 2C.

C. Differentiation of The hMSCs 

In initial experiments where transfected hMSC cells have been
induced to differentiate into adipocytes or osteocytes, the results indicate
that the transfected cells appear to be differentiating at a rate comparable
to the untransfected controls and the cultures are lineage specific as
tested by microscopic examination, FISH, Oil Red O staining (adipocyte
assay), and calcium secretion (osteocyte assay).

Accordingly, these results indicate that the artificial chromosomes
(ACes) provided herein can be successfully transferred into hMSC target
cells. Targeting MSCs (such as hMSCs) permits gene transfer into cells in
an undifferentiated state where the cells are easier to expand and purify.
The genetically modified cells can then be differentiated in vitro or
injected into a site in vivo where the microenvironment will induce
transformation into specific cell lineages.



EXAMPLE 7 

Delivery of a Promoterless Marker Gene to a Platform ACes 

Platform ACes containing pSV40attPsensePURO (Figure 4) were
constructed as set forth in Examples 3 and 4.

A. Construction of Targeting Vectors.

The base vector p18attBZeo (3166bp; SEQ ID NO: 114) was
constructed by ligating the 1067bp HindIII-SspI fragment containing
attBZeo, obtained from pLITattBZeo (SEQ ID NO:91), into pUC18 (SEQ ID
NO: 122) digested with HindIII and SspI.

1. p18attBZEO-eGFP (6119bp; SEQ ID NO: 126) was constructed
by inserting the 2977bp SpeI-HindIII fragment from pCXeGFP (SEQ ID
NO:71; Okabe, et al. (1997) FEBS Lett 407:313-319) containing the eGFP
gene into p18attBZeo (SEQ ID NO: 114) digested with HindIII and XbaI.

2. p18attBZEO-5'6XHS4eGFP (Figure 10; 7631bp; SEQ ID NO:
116) was constructed by ligating the 4465bp HindIII fragment from
pCXeGFPattB(6XHS4)2 (SEQ ID NO: 123), which contains the eGFP gene
under the regulation of the chicken beta actin promoter, 6 copies of the
HS4 core element located 5' of the chicken beta actin promoter and the
polyadenylation signal, into the HindIII site of p18attBZeo (SEQ ID NO:
114).

3. p18attBZEO-3'6XHS4eGFP (Figure 11; 7600bp; SEQ ID NO:
115) was created by removing the 5'6XHS4 element from
p18attBZeo-(6XHS4)2eGFP (SEQ ID NO: 110). p18attBZeo-(6XHS4)2eGFP was
digested with EcoRV and SpeI, treated with Klenow and religated to form
p18attBZeo3'6XHS4eGFP (SEQ ID NO: 115).

4. p18attBZEO-(6XHS4)2eGFP (Figure 12; 9080bp; SEQ ID NO:
110) was created in two steps. First, the EcoRI-SpeI fragment from
pCXeGFPattB(6XHS4)2 (SEQ ID NO: 123), which contains 6 copies of the
HS4 core element, was ligated into p18attBZeo (SEQ ID NO: 114)
digested with EcoRI and XbaI to create p18attBZeo6XHS4 (4615bp; SEQ
ID NO: 117). Next, p18attBZeo6XHS4 was digested with HindIII and
ligated to the 4465bp HindIII fragment from pCXeGFPattB(6XHS4)2
which contains the eGFP gene under the regulation of the chicken beta
actin promoter, 6 copies of the HS4 core element located 5' of the
chicken beta actin promoter and the polyadenylation signal.
Table 2

Targeting plasmid          No. zeocin    No. clones with   No. clones with correct
                           resistant     expected PCR      sequence at
                           clones        product size      recombination junction

p18attBZEO-eGFP            12            12                NT*
p18attBZEO-5'6XHS4eGFP     11            11                NT
p18attBZEO-3'6XHS4eGFP     11            11                NT
p18attBZEO-(6XHS4)2eGFP    9             9                 4/4

*NT = not tested


B. Transfection and Selection with Drug.

The mouse cell line containing the 2nd generation platform ACes,
B19-38 (constructed as set forth in Example 3), was plated onto four
10 cm dishes at approximately 5 million cells per dish. The cells were
incubated overnight in DMEM with 10% fetal calf serum at 37°C and 5%
CO2. The following day the cells were transfected with 5 µg of each of
the 4 vectors listed in Example 7.A. above and 5 µg of pCXLamIntR (SEQ
ID NO: 112), for a total of 10 µg per 10 cm dish. Lipofectamine Plus
reagent was used to transfect the cells according to the manufacturer's
protocol. Two days post-transfection zeocin was added to the medium at
500 µg/ml. The cells were maintained in selective medium until colonies
formed. The colonies were then ring-cloned (see, e.g., McFarland, 2000,
Methods Cell Sci, Mar;22(1):63-66).

C. Analysis of Clones (PCR, SEQUENCING).

Genomic DNA was isolated from each of the candidate clones with
the Wizard kit (Promega) following the manufacturer's protocol. The
following primer set was used to analyze the genomic DNA isolated from
the zeocin resistant clones: 5PacSV40 -
CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO:76);
Antisense Zeo - TGAACAGGGTCACGTCGTCC (SEQ ID NO:77). PCR
amplification with the above primers and genomic DNA from the
site-specific integration of any of the 4 zeocin vectors would result in a 673bp
PCR product.

As set forth in Table 2, of the 4 zeocin resistant candidate clones
thus far analyzed by PCR, all 4 exhibit the correct sequence for a
site-specific integration event.

EXAMPLE 8 

Integration of a PCR product by site-specific recombination. 

In this example a gene is integrated onto the platform ACes by
site-specific recombination without cloning said gene into a vector.
A. PCR PRIMER DESIGN.

PCR primers are designed to contain an attB site at the 5' end of
one of the primers in the primer set. The remaining primers, which could
be one or more than one primer, do not contain an attB site, but are
complementary to sequences flanking the gene or genes of interest and
any associated regulatory sequences. In a first example, 2 primers (one
containing an attB site) are used to amplify a selective gene such as
puromycin.

In a second example as shown in Figure 13, the primer set includes
primers 1 & 2 that amplify the GFP gene without amplification of an
upstream promoter. Primer 1 contains the attB site at the 5' end of the
oligo. Primers 3 & 4 are designed to amplify the IRES-blasticidin DNA
sequences from the vector pIRESblasticidin. The 5' end of primer 3
contains sequences complementary to the 5' end of primer 2 such that
annealing can occur between 5' ends of the two primers.
B. PCR REACTION AND SUBSEQUENT LIGATION TO CREATE 
CIRCULAR MOLECULES FROM THE PCR PRODUCT 
In the first example set forth above in Section A, the two PCR
primers are combined with a puromycin DNA template such as pPUR
(Clontech), a heat stable DNA polymerase and appropriate conditions for
DNA amplification. The resulting PCR product (attB-Puromycin) is then
purified and self-ligated to form a circular molecule.

In the second example set forth above in Section A, amplification
of the GFP gene and IRES-blasticidin sequences is accomplished by
combining primers 1 & 2 with DNA template pD2eGFP and primers 3 & 4
with template pIRESblasticidin under appropriate conditions to amplify the
desired template. After initial amplification of the two products (attB-GFP
& IRES-blasticidin) in separate reactions, a second round of amplification
using both of the PCR products from the first round of amplification
together with primers 1 and 4 amplifies the fusion product attB-GFP-IRES-
blasticidin (Figure 13). This technique of using complementary sequences
in primer design to create a fusion product is employed in Saccharomyces
cerevisiae for allele replacement (Erdeniz et al. (1997) Gen Res 7:1174-
1183). The amplified product is then purified from the PCR reaction
mixture by standard methods and ligated to form a circular molecule.
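
The fusion step described above can be illustrated in silico. The following is a minimal Python sketch under stated assumptions, not the laboratory procedure itself: it joins two first-round products through a shared terminal overlap of the kind introduced by the 5' tail of primer 3. The function name fuse_products and the short toy sequences are hypothetical placeholders and are not the actual attB-GFP or IRES-blasticidin sequences.

```python
def fuse_products(upstream: str, downstream: str, min_overlap: int = 8) -> str:
    """Join two sequences whose ends overlap, as in fusion PCR.

    Finds the longest suffix of `upstream` that equals a prefix of
    `downstream` and returns the merged sequence with the overlap kept once.
    """
    max_len = min(len(upstream), len(downstream))
    for size in range(max_len, min_overlap - 1, -1):
        if upstream[-size:] == downstream[:size]:
            return upstream + downstream[size:]
    raise ValueError("no overlap of the required length was found")


# Hypothetical toy sequences; the shared 10-nt overlap stands in for the
# complementary region built into primers 2 and 3.
overlap = "GATTACAGGC"
product_1 = "ATGGTGAGCAAG" + overlap   # stands in for attB-GFP
product_2 = overlap + "TTGCCACAACCC"   # stands in for IRES-blasticidin
fusion = fuse_products(product_1, product_2)
print(fusion)
```

In the sketch the overlap appears only once in the fused product, which mirrors why amplification with the outer primers (primers 1 and 4) yields a single contiguous molecule.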
C. INTRODUCTION OF PCR PRODUCT ONTO THE ACes USING A 
RECOMBINASE 

The circular PCR product is then introduced to the platform
ACes using the bacteriophage lambda integrase E174R. The introduction
can be performed in vivo by transfecting the pCXLamIntR (SEQ ID NO:
112) vector encoding the lambda integrase mutant E174R together with
the circularized PCR product into a cell line containing the platform ACE.





D. SELECTION FOR MARKER GENE 

The marker gene (in this case either puromycin, blasticidin or GFP)
is used to enrich the population for cells containing the proper integration
event. A proper integration event in the second example (Figure 14)
juxtaposes a promoter residing on the platform ACes 5' to the attB-GFP-
IRES-Blasticidin PCR product, allowing for transcription of both GFP and
blasticidin. If enrichment is done by drug selection, blasticidin is added to
the medium on the transfected cells 24-48 hours post-transfection.
Selection is maintained until colonies are formed on the plates. If
enrichment is done by cell sorting, cells are sorted 2-4 days post-
transfection to enrich for cells expressing the fluorescent marker (GFP in
this case).

E. ANALYSIS OF CLONES 

Clonal isolates are analyzed by PCR, FISH and sequence analysis to
confirm proper integration events.

EXAMPLE 9 

Construction of a human platform ACes "ACE 0.1" 

A. CONSTRUCTION OF THE TARGETING VECTOR pPACrDNA 

Genome Systems (IncyteGenomics) was supplied with the primers
5'HETS (GGGCCGAAACGATCTCAACCTATT; SEQ ID NO:78), and
3'HETS (CGCAGCGGCCCTCCTACTC; SEQ ID NO:79), which were used
to amplify a 538bp PCR product homologous to nt 9680-10218 of the
human rDNA sequences (GenBank Accession No. U13369) and used as a
probe to screen a human genomic PAC (P1 Artificial Chromosome)
library constructed in the vector pCYPAC2 (Ioannou et al. (1994) Nat.
Genet. 6(1): 84-89). Genome Systems clone #18720 was isolated in this
screen and contains three repeats of human rDNA as assessed by
restriction analysis. GS clone #18720 was digested with PmeI, a
restriction enzyme unique to a single repeat of the human rDNA (45Kbp),
and then religated to form pPACrDNA (Figure 15). The insert in
pPACrDNA was analyzed by restriction digests and sequence analysis of
the 5' and 3' termini. The rDNA sequences of pPACrDNA are homologous
to GenBank Accession No. U13369, containing an insert of about 45 kb
comprising a single repeat beginning from the end of one repeat at
~33980 (relative to the GenBank sequence) through the beginning of the
next repeat up to approximately 35120 (the repeat offset from that listed
in the GenBank file). Thus, the rDNA sequence is just over 1 copy of the
repeat extending from 33980 (+/-10bp) to the end of the first repeat
(43Kbp) and continuing into the second repeat to bp 35120 (+/-10bp).
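
The statement that the insert is "just over 1 copy" of the repeat can be checked with simple arithmetic. The sketch below is illustrative only; it assumes a repeat length of exactly 43,000 bp (the text gives 43Kbp as an approximate figure) and ignores the stated +/-10bp uncertainty on each endpoint. The function and constant names are hypothetical.

```python
REPEAT_LENGTH_BP = 43_000       # assumed: the approximate 43Kbp repeat length from the text
START_IN_FIRST_REPEAT = 33_980  # start coordinate stated in the text
END_IN_SECOND_REPEAT = 35_120   # end coordinate (in the next repeat) stated in the text


def insert_length(start: int, end: int, repeat_length: int) -> int:
    """Length of a segment that starts in one repeat and ends in the next."""
    tail_of_first_repeat = repeat_length - start
    return tail_of_first_repeat + end


length_bp = insert_length(START_IN_FIRST_REPEAT, END_IN_SECOND_REPEAT, REPEAT_LENGTH_BP)
copies = length_bp / REPEAT_LENGTH_BP
print(f"{length_bp} bp, {copies:.2f} repeat copies")
```

Under these assumptions the segment is roughly 44 kb, consistent with the "about 45 kb" insert and with slightly more than one repeat unit.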

B. TRANSFECTION AND ACes FORMATION.

Five hundred thousand MSU1.1 cells (Morgan et al., 1991, Exp.
Cell Res., Nov; 197(1): 125-136; provided by Dr. Justin McCormick at
Michigan State University) were plated per 6cm plate (3 plates total) and
allowed to grow overnight. The cells were 70-80% confluent the
following day. One plate was transfected with 15 µg pPACrDNA
(linearized with Pme I) and 2 µg pSV40attPsensePuro (linearized with Sca
I; see Example 3). The remaining plates were controls and were
transfected with either 20 µg pBS (Stratagene) or 20 µg
pSV40attBsensePuro (linearized with Sca I). All three plates were
transfected using a CaPO4 protocol.

C. SELECTION OF PUROMYCIN RESISTANT COLONIES 

One day post-transfection the cells were "glycerol shocked" by the
addition of PBS medium containing 10% glycerol for 30 seconds.
Subsequently, the glycerol was removed and replaced with fresh DMEM.
Four days post-transfection selective medium was added. Selective
medium contains 1 µg/ml puromycin. The transfection plates were
maintained at 37°C with 5% CO2 in selective medium for 2 weeks at
which point colonies could be seen on the plate transfected with
pPACrDNA and pSV40attPsensePuro. The colonies were ring-cloned
from the plate on day 17 post-selection and expanded in selective
medium for analysis. Only two colonies (M2-2d & M2-2b) were able to
proliferate in the selective medium after cloning. No colonies were seen
on the control plates after 37 days in selective medium.
D. ANALYSIS OF CLONES 

FISH analysis was performed on the candidate clones to detect
ACes formation. Metaphase spreads from the candidate clones were
probed in multiple probe combinations. In one experiment, the probes
used were biotin-labeled human alphoid DNA (pPACrDNA) and
digoxigenin-labeled mouse major DNA (pFK161) as a negative control.
Candidate M2-2d was single cell subcloned by flow sorting and the
candidate subclones were reanalyzed by FISH. Subclone 1B1 of M2-2d
was determined to be a platform ACes and is also designated human
Platform ACE 0.1.

EXAMPLE 10

Site-specific integration of a marker gene onto a human platform ACE 0.1

The promoterless delivery method was used to deliver a
promoterless blasticidin marker gene onto the human platform ACes.
In the population co-transfected with pLIT38attB-BSRpolyA10 and
pCXLamIntR (Figure 8; SEQ ID NOs. 111 and 112), 21 of 38 blasticidin
resistant clones displayed a PCR product of the expected size. In
contrast, the population transfected with pBlueScript yielded 0
blasticidin resistant colonies.

A. CONSTRUCTION OF pLIT38attB-BSRpolyA10 & pLIT38attB-
BSRpolyA2.

The vector pLITMUS 38 (New England Biolabs; U.S. Patent No.
5,691,140; SEQ ID NO: 119) was digested with EcoRV and ligated to
two annealed oligomers, which form an attB site (attB1 5'-
TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA-3' (SEQ ID NO:8); attB2
5'-TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA-3' (SEQ ID NO:9)).
This ligation reaction resulted in the vector pLIT38attB (SEQ ID NO: 120).
The blasticidin resistance gene and SV40 polyA site were PCR amplified
with primers: 5BSD (ACCATGAAAACATTTAACATTTCTCAACA; SEQ ID
NO:80) and SV40polyA (TTTATTTGTGAAATTTGTGATGCTATTGC; SEQ
ID NO:81) using pPAC4 (Frengen, E., et al. (2000) Genomics 68 (2), 118-
126; GenBank Accession No. U75992) as template. The blasticidin-
SV40polyA PCR product was then ligated into pLIT38attB at the BamHI
site, which was Klenow treated following digestion with BamHI.
pLIT38attB-BSDpolyA10 (SEQ ID NO: 111) and pLIT38attB-BSDpolyA2
(SEQ ID NO: 121) are the two resulting orientations of the PCR product
ligated into the vector.
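
For two oligomers to anneal into a blunt double-stranded attB site, each must be the reverse complement of the other. The short Python sketch below checks this for the attB1 and attB2 sequences given above. It is an illustrative consistency check only; the helper name reverse_complement is a hypothetical convenience and not part of any cited method.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


ATTB1 = "TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA"  # SEQ ID NO:8 as given in the text
ATTB2 = "TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA"  # SEQ ID NO:9 as given in the text

anneals = reverse_complement(ATTB1) == ATTB2
print(len(ATTB1), len(ATTB2), anneals)
```

Because the two 33-mers are exact reverse complements, the annealed duplex has no overhangs, which is consistent with its ligation into the blunt EcoRV site.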

B. TRANSFECTION OF MSU1.1 CELLS CONTAINING HUMAN
PLATFORM ACE 0.1.

MSU1.1 cells containing human platform ACE 0.1 (see Example 9)
were expanded and plated to five 10cm dishes with 1.3x10^6 cells per
dish. The cells were incubated overnight in DMEM with 10% fetal bovine
serum, at 37°C and 5% CO2. The following day the cells were
transfected with 5 µg of each plasmid as set forth in Table 3, for a total of
10 µg of DNA per plate of cells transfected, using ExGen 500
in vitro transfection reagent (MBI Fermentas, cat. no. R0511). The
transfection was performed according to the manufacturer's protocol.

Cells were incubated at 37°C with 5% CO2 in DMEM with 10% fetal
bovine serum following the transfection.

Table 3

Plate #   Plasmid 1     Plasmid 2                No. BsdR Colonies
1         pBS           None                     0
2         pCXLamInt     pLIT38attB-BSRpolyA10    16
3         pCXLamIntR    pLIT38attB-BSRpolyA10    40
4         pCXLamInt     pLIT38attB-BSRpolyA2     28
5         pCXLamIntR    pLIT38attB-BSRpolyA2     36



C. SELECTION OF BLASTICIDIN RESISTANT CLONES.

Three days following the transfection the cells were split from a 10
cm dish to two 15cm dishes. The cells were maintained in DMEM with
10% fetal bovine serum for 4 days in the 15 cm dishes. Seven days
post-transfection blasticidin was introduced into the medium. Stably
transfected cells were selected with 1 µg/ml blasticidin. The number of
colonies formed on each plate is listed in Table 3. These colonies were
ring-cloned and expanded for PCR analysis. Upon expansion in blasticidin
containing medium some clones failed to survive and therefore do not have
corresponding PCR data.

D. PCR ANALYSIS

Thirty-eight of the 40 clones from plate 3 grew after ring-cloning.
Genomic DNA was isolated from these clones with the Promega Wizard
Genomic DNA purification kit, digested with EcoRI and used as template
in a PCR reaction with the following primers: 3BSP -
TTAATTTCGGGTATATTTGAGTGGA (SEQ ID NO:82); 5PacSV40 -
CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO:76).
The PCR conditions were as follows. 100ng of genomic DNA was
amplified with 0.5 µl Herculase polymerase (Stratagene) in a 50 µl reaction
that contained 12.5pmole of each primer, 2.5mM of each dNTP, and 1X
Herculase buffer (Stratagene). The reactions were placed in a PerkinElmer
thermocycler programmed as follows: Initial denaturation at 95°C for 10
minutes; 35 cycles of 94°C for 1 minute, 53°C for 1 minute, 72°C for 1
minute, and 72°C for 1 minute; Final extension for 10 minutes at 72°C;
and 4°C hold. If pLIT38attB-BSRpolyA10 integrates onto the human
platform ACE 0.1 correctly, PCR amplification with the above primers
should yield an 804bp product. Twenty-one of the 38 clones from plate
3 produced a PCR product of the expected 804bp size.
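
The counts reported for plate 3 can be expressed as fractions. The sketch below is a simple illustration that uses only the colony numbers from Table 3 and the clone numbers stated in the text; the variable names are hypothetical, and no statistical significance is implied by the percentages.

```python
# Colony counts copied from Table 3 (plate number -> blasticidin resistant colonies).
colonies = {1: 0, 2: 16, 3: 40, 4: 28, 5: 36}

# Follow-up of plate 3 as reported in the text.
clones_grown = 38          # clones that grew after ring-cloning
clones_pcr_positive = 21   # clones with the expected 804bp product

survival_fraction = clones_grown / colonies[3]
pcr_positive_fraction = clones_pcr_positive / clones_grown

print(f"survived ring-cloning: {survival_fraction:.1%}")
print(f"expected 804bp product: {pcr_positive_fraction:.1%}")
```

On these numbers, 95% of the plate 3 colonies survived expansion and a little over half of the surviving clones gave the product expected for site-specific integration.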

EXAMPLE 11

Delivery of a Vector comprising a Promoterless Marker Gene and a gene 
encoding a therapeutic product to a Platform ACes 

Platform ACes containing pSV40attPsensePURO (Figure 4) were
constructed as set forth in Examples 3 and 4.

A. CONSTRUCTION OF DELIVERY VECTORS 

1. Erythropoietin cDNA vector, p18EPOcDNA. 

The erythropoietin cDNA was PCR amplified from a human cDNA
library (E. Perkins et al., 1999, Proc. Natl. Acad. Sci. USA 96(5): 2204-
2209) using the following primers: EPO5XBA -
TATCTAGAATGGGGGTGCACGAATGTCCTGCC (SEQ ID NO: 83);
EPO3BSI - TACGTACGTCATCTGTCCCCTGTCCTGCAGGC (SEQ ID NO:
84). The cDNA was amplified through two successive rounds of PCR
using the following conditions: heat denaturation at 95°C for 3 minutes;
35 cycles of a 30 second denaturation (95°C), 30 seconds of annealing
(60°C), and 1 minute extension (72°C); the last cycle is followed by a 7
minute extension at 72°C. BIO-X-ACT (BIOLINE) was used to amplify the
erythropoietin cDNA from 2.5ng of the human cDNA library in the first
round of amplification. Five µl of the first amplification product was used
as template for the second round of amplification. Two PCR products
were produced from the second amplification with Taq polymerase
(Eppendorf); each product was cloned into pCR2.1-Topo (Invitrogen) and
sequenced. The larger PCR product contained the expected cDNA
sequence for erythropoietin. The erythropoietin cDNA was moved from
pTopoEPO into p18attBZeo(6XHS4)2eGFP (SEQ ID NO: 110). pTopoEPO
was digested with BsiWI and XbaI to release a 588 bp EPO cDNA. BsrGI
and BsiWI create compatible ends. The eGFP gene was removed from
p18attBZeo(6XHS4)2eGFP by digestion with BsiWI and XbaI; the 8.3 Kbp
vector backbone was gel purified and ligated to the 588 bp EPO cDNA to
create p18EPOcDNA (SEQ ID NO: 124).

2. Genomic erythropoietin vector, p18genEPO.
The erythropoietin genomic clone was PCR amplified from a human
genomic library (Clontech) using the following primers: GENEPO3BSI -
CGTACGTCATCTGTCCCCTGTCCTGCA (SEQ ID NO: 85); GENEPO5XBA -
TCTAGAATGGGGGTGCACGGTGAGTACT (SEQ ID NO: 86). The
reaction conditions for the amplification were as follows: heat
denaturation for 3 minutes (95°C); 30 cycles of a 30 second denaturation
(95°C), 30 seconds annealing (from 65°C decreasing 0.5°C per cycle to
50°C), and 3 minute extension (72°C); 15 cycles of a 30 second
denaturation (95°C), 30 seconds annealing (50°C), and 3 minute
extension (72°C); the last cycle is followed by a 7 minute extension at
72°C. The erythropoietin genomic PCR product (2147 bp) was gel
purified and cloned into pCR2.1Topo to create pTopogenEPO. Sequence
analysis revealed 2bp substitutions and insertions in the intronic
sequences of the genomic clone of erythropoietin. A partial digest with
XbaI and complete digest with BsiWI excised the erythropoietin genomic
insert from pTopogenEPO. The resulting 2158 bp genomic erythropoietin
fragment was ligated into the 8.3 Kbp fragment resulting from the
digestion of p18attBZeo(6XHS4)2eGFP (SEQ ID NO: 110) with XbaI and
BsrGI to create p18genEPO (SEQ ID NO: 125).
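
The annealing profile described above for the genomic erythropoietin amplification is a touchdown program: the annealing temperature falls by 0.5°C per cycle over 30 cycles and is then held at 50°C for 15 cycles. The Python sketch below lists the annealing temperature cycle by cycle. It assumes that the first cycle anneals at 65°C and that the decrement is applied before each subsequent cycle; the text does not state this explicitly, and the function name is hypothetical.

```python
def touchdown_annealing(start_c, step_c, floor_c, touchdown_cycles, fixed_cycles):
    """Return the annealing temperature (in °C) for every cycle of the program."""
    # Touchdown phase: drop by `step_c` each cycle, never going below the floor.
    temps = [max(start_c - step_c * cycle, floor_c) for cycle in range(touchdown_cycles)]
    # Fixed phase: hold the floor temperature for the remaining cycles.
    temps.extend([floor_c] * fixed_cycles)
    return temps


schedule = touchdown_annealing(65.0, 0.5, 50.0, touchdown_cycles=30, fixed_cycles=15)
print(len(schedule), schedule[0], schedule[29], schedule[30])
```

Under this assumption the thirtieth touchdown cycle anneals at 50.5°C, so the program reaches the 50°C floor exactly as the fixed 15-cycle phase begins.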

B. TRANSFECTION AND SELECTION WITH DRUG 

The erythropoietin genomic and cDNA genes were each moved
onto the platform ACes B19-38 (constructed as set forth in Example 3) by
co-transfecting with pCXLamIntR. Control transfections were also
performed using pCXLamInt (SEQ ID NO: 127) together with either
p18EPOcDNA (SEQ ID NO: 124) or p18genEPO (SEQ ID NO: 125).
Lipofectamine Plus was used to transfect the DNAs into B19-38 cells
according to the manufacturer's protocol. The cells were placed in
selective medium (DMEM with 10% FBS and Zeocin at 500 µg/ml) 48
hours post-transfection and maintained in selective medium for 13 days.
Clones were isolated 15 days post-transfection.

C. ANALYSIS OF CLONES (ELISA, PCR)
1. ELISA Assays

Thirty clones were tested for erythropoietin production by an ELISA
assay using a monoclonal anti-human erythropoietin antibody (R&D
Systems, Catalogue # MAB287), a polyclonal anti-human erythropoietin
antibody (R&D Systems, Catalogue # AB-286-NA) and alkaline
phosphatase conjugated goat-anti-rabbit IgG (heavy and light chains)
(Jackson ImmunoResearch Laboratories, Inc., Catalogue # 111-055-144).
The negative control was a Zeocin resistant clone isolated from B19-38
cells transfected with p18attBZeo(6XHS4) (SEQ ID NO: 117; no insert
control vector) and pCXLamIntR (SEQ ID NO: 112). The preliminary
ELISA assay was executed as follows: 1) Nunc-Immuno Plates (MaxiSorb
96-well, Catalogue # 439454) were coated with 75 µl of a 1/200 dilution
(in Phosphate buffered Saline, pH 7.4 (PBS), Sigma Catalogue # P-3813)
of monoclonal anti-human erythropoietin antibody overnight at 4°C. 2)
The following day the plates were washed 3 times with 300 µl PBS
containing 0.15% Tween 20 (Sigma, Catalogue # P-9416). 3) The plates
were then blocked with 300 µl of 1% Bovine Serum Albumin (BSA; Sigma
Catalogue # A-7030) in PBS for 1 hour at 37°C. 4) Repeat the washes as
in step 2. 5) The clonal supernatants (75 µl per clone per well of 96-well
plate) were then added to the plate and incubated for 1 hour at 37°C.
The clonal supernatant analyzed in the ELISA assay had been maintained
on the cells 7 days prior to analysis. 6) Repeat the washes of step 2. 7)
Add 75 µl of polyclonal anti-human erythropoietin antibody (1/250 dilution
in dilution buffer (0.5% BSA, 0.01% Tween 20, 1X PBS, pH 7.4)) and
incubate 1 hour at 37°C. 8) Repeat washes of step 2. 9) Add 75 µl of
goat anti-rabbit conjugated alkaline phosphatase diluted 1/4000 in dilution
buffer and incubate 1 hour at 37°C. 10) Repeat washes of step 2. 11)
Add 75 µl substrate, p-nitrophenyl phosphate (Sigma N2640), diluted to
1 mg/ml in substrate buffer (0.1 Ethanolamine-HCl (Sigma, Catalogue # E-
6133), 5mM MgCl2 (Sigma, Catalogue # M-2393), pH 9.8). Incubate the
plates in the dark for 1 hour at room temperature (22°C). 12) Read the
absorption at 405nm (reference wavelength 495nm) on a Universal
Microplate Reader (Bio-Tek Instruments, Inc., model # ELX800 UV). The
erythropoietin standard curve was derived from readings of diluted human
recombinant Erythropoietin (Roche, catalogue # 1-120-166; dilution range
125 - 7.8mUnits/ml). From this preliminary assay the 21 clones
displaying the highest expression of erythropoietin were analyzed a
second time in the same manner using medium supernatants that had
been on the clones for 24 hours and a 1:3 dilution thereof.
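
Two of the quantities in the ELISA protocol above lend themselves to a short worked calculation. The sketch below is illustrative and rests on assumptions not stated in the text: that the standards were prepared as a two-fold serial dilution (the text gives only the range, 125 to 7.8 mUnits/ml, which five two-fold steps would span), and that a full 96-well plate is coated with no allowance for pipetting overage. The function names are hypothetical.

```python
def twofold_series(top, points):
    """Concentrations of a two-fold serial dilution starting at `top`."""
    return [top / 2 ** i for i in range(points)]


def stock_volume_ul(final_volume_ul, dilution_factor):
    """Volume of undiluted stock needed for a 1/dilution_factor dilution."""
    return final_volume_ul / dilution_factor


standards = twofold_series(125.0, 5)   # mUnits/ml; assumed two-fold steps
coating_total_ul = 75 * 96             # 75 µl per well, assumed full 96-well plate
coating_stock_ul = stock_volume_ul(coating_total_ul, 200)  # 1/200 dilution from the text

print(standards)
print(coating_total_ul, coating_stock_ul)
```

With these assumptions the lowest standard is 7.8125 mUnits/ml, matching the stated lower limit of 7.8 mUnits/ml, and coating one plate requires 36 µl of monoclonal antibody stock in 7.2 ml of PBS.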

2. PCR Analysis

Genomic DNA was isolated from the 21 clones with the best
expression (as assessed by the initial ELISA assay above) as well as the
B19-38 cell line and used for PCR analysis. Genomic DNA was isolated
using the Wizard genomic DNA purification kit (Promega) according to the
manufacturer's protocol. Amplification was performed on 100ng of
genomic DNA as template with MasterTaq DNA Polymerase (Eppendorf)
and the primer set 5PacSV40 -
CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO: 76) and
Antisense Zeo - TGAACAGGGTCACGTCGTCC (SEQ ID NO: 77). The amplification
conditions were as follows: heat denaturation for 3 minutes (95°C); 30
cycles of a 30 second denaturation (95°C), 30 seconds annealing (from
65°C decreasing 0.5°C per cycle to 50°C), and 1 minute extension
(72°C); 15 cycles of a 30 second denaturation (95°C), 30 seconds
annealing (50°C), and 1 minute extension (72°C); the last cycle is
followed by a 10 minute extension at 72°C. PCR products were size
separated by gel electrophoresis. Of the 21 clones analyzed 19 produced
a PCR product of 650 bp as expected for a site-specific integration event.
All nineteen clones were the result of transformations with p18EPOcDNA
(5) or p18genEPO (14) and pCXLamIntR (i.e. mutant integrase). The
remaining two clones, both of which were the result of transformation
with p18genEPO (SEQ ID NO: 125) and pCXLamInt (i.e. wildtype
integrase; SEQ ID NO: 127), produced a 400 bp PCR product.



Preparation of a Transformation Vector Useful for the Induction of Plant
Artificial Chromosome Formation 

Plant artificial chromosomes (PACs) can be generated by
introducing nucleic acid, such as DNA, which can include a targeting
DNA, for example rDNA or lambda DNA, into a plant cell, allowing the cell
to grow, and then identifying from among the resulting cells those that
include a chromosome with a structure that is distinct from that of any
chromosome that existed in the cell prior to introduction of the nucleic
acid. The structure of a PAC reflects amplification of chromosomal DNA,
for example, segmented, repeat region-containing and heterochromatic
structures. It is also possible to select cells that contain structures that
are precursors to PACs, for example, chromosomes containing more than
one centromere and/or fragments thereof, and culture and/or manipulate
them to ultimately generate a PAC within the cell.
In the method of generating PACs, the nucleic acid can be
introduced into a variety of plant cells. The nucleic acid can include
targeting DNA and/or a plant expressible DNA encoding one or multiple
selectable markers (e.g., DNA encoding bialaphos (bar) resistance) or
scorable markers (e.g., DNA encoding GFP). Examples of targeting DNA
include, but are not limited to, N. tabacum rDNA intergenic spacer
sequence (IGS) and Arabidopsis rDNA such as the 18S, 5.8S, 26S rDNA
and/or the intergenic spacer sequence. The DNA can be introduced using
a variety of methods, including, but not limited to Agrobacterium-
mediated methods, PEG-mediated DNA uptake and electroporation using,
for example, standard procedures according to Hartmann et al. [(1998)
Plant Molecular Biology 35:741]. The cell into which such DNA is
introduced can be grown under selective conditions and can initially be
grown under non-selective conditions and then transferred to selective
media. The cells or protoplasts can be placed on plates containing a
selection agent to grow, for example, individual calli. Resistant calli can
be scored for scorable marker expression. Metaphase spreads of resistant
cultures can be prepared, and the metaphase chromosomes examined by
FISH analysis using specific probes in order to detect amplification of
regions of the chromosomes. Cells that have artificial chromosomes with
functioning centromeres or artificial chromosomal intermediate structures,
including, but not limited to, dicentric chromosomes, formerly dicentric
chromosomes, minichromosomes, heterochromatin structures (e.g.
sausage chromosomes), and stable self-replicating artificial chromosomal
intermediates as described herein, are
identified and cultured. In particular, the cells containing self-replicating
artificial chromosomes are identified.

The DNA introduced into a plant cell for the generation of PACs
can be in any form, including in the form of a vector. An exemplary
vector for use in methods of generating PACs can be prepared as follows.

For the production of artificial chromosomes, plant transformation
vectors, as exemplified by pAgIIa and pAgIIb, containing a selectable
marker, a targeting sequence, and a scorable marker were constructed
using procedures well known in the art to combine the various fragments.
The vectors can be prepared using vector pAg1 as a base vector and
inserting the following DNA fragments into pAg1: DNA encoding β-
glucuronidase under the control of the nopaline synthase (NOS) promoter
fragment and flanked at the 3' end by the NOS terminator fragment, a
fragment of mouse satellite DNA and an N. tabacum rDNA intergenic
spacer sequence (IGS). In constructing plant transformation vectors,
vector pAg2 can also be used as the base vector.
1. Construction of pAg1

Vector pAg1 (SEQ. ID. NO: 89) is a derivative of the CAMBIA
vector named pCambia 3300 (Center for the Application of Molecular
Biology to International Agriculture, i.e., CAMBIA, Canberra, Australia;
www.cambia.org), which is a modified version of vector pCambia 1300
to which has been added DNA from the bar gene conferring resistance to
phosphinothricin. The nucleotide sequence of pCambia 3300 is provided
in SEQ. ID. NO: 90. pCambia 3300 also contains a lacZ alpha sequence
containing a polylinker region.

pAg1 was constructed by inserting two new functional DNA
fragments into the polylinker of pCambia 3300: one sequence containing
an attB site and a promoterless zeomycin resistance-encoding DNA
flanked at the 3' end by a SV40 polyA signal sequence, and a second
sequence containing DNA from the hygromycin resistance gene
(hygromycin phosphotransferase) conferring resistance to hygromycin for
selection in plants. Although the zeomycin-SV40 polyA signal fusion is
not expected to function in plant cells, it can be activated in mammalian
cells by insertion of a functional promoter element into the attB site by
site-specific recombination catalyzed by the Lambda att integrase. Thus,
the inclusion of the attB-zeomycin sequences allows for evaluation of
functionality of plant artificial chromosomes in mammalian cells by
activation of the zeomycin resistance-encoding DNA, and provides an att
site for further insertion of new DNA sequences into plant artificial
chromosomes formed as a result of using pAg1 for plant transformation.
The second functional DNA fragment allows for selection of plant cells
with hygromycin. Thus, pAg1 contains DNA from the bar gene conferring
resistance to phosphinothricin, DNA from the hygromycin resistance gene,
both resistance-encoding DNAs under the control of a separate
cauliflower mosaic virus (CaMV) 35S promoter, and the attB-promoterless
zeomycin resistance-encoding DNA.

pAg1 is a binary vector containing Agrobacterium right and left T-
DNA border sequences for use in Agrobacterium-mediated transformation
of plant cells or protoplasts with the DNA located between the border
sequences. pAg1 also contains the pBR322 Ori for replication in E. coli.
pAg1 was constructed by ligating HindIII/PstI-digested p3300attBZeo
with HindIII/PstI-digested pBSCaMV35SHyg as follows.
a. Generation of p3300attBZeo

Plasmid pCambia 3300 was digested with PstI/Ecl136II and ligated
with PstI/StuI-digested pLITattBZeo (the nucleotide sequence of
pLITattBZeo is provided in SEQ. ID. NO: 91), which contains DNA
encoding the zeocin resistance gene and an attB Integrase recognition
sequence, to generate p3300attBZeo which contains an attB site, a
promoterless
zeomycin resistance-encoding DNA flanked at the 3' end by a SV40
polyA signal, and a reconstructed PstI site.

b. Generation of pBSCaMV35SHyg 

A DNA fragment containing DNA encoding hygromycin 
phosphotransferase flanked by the CaMV 35S promoter and the CaMV 
35S polyA signal sequence was obtained by PCR amplification of plasmid 
pCambia 1302 (GenBank Accession No. AF234298 and SEQ. ID. NO: 
92). The primers used in the amplification reaction were as follows: 
CaMV35SpolyA: 

5'-CTGAATTAACGCCGAATTAATTCGGGGGATCTG-3' SEQ. ID. NO: 93 
CaMV35Spr: 

5'-CTAGAGCAGCTTGCCAACATGGTGGAGCA-3' SEQ. ID. NO: 94 
The 2100-bp PCR fragment was ligated with EcoRV-digested pBluescript
II SK+ (Stratagene, La Jolla, CA, U.S.A.) to generate pBSCaMV35SHyg. 

c. Generation of pAg1

To generate pAg1, pBSCaMV35SHyg was digested with
HindIII/PstI and ligated with HindIII/PstI-digested p3300attBZeo. Thus,
pAg1 contains the pCambia 3300 backbone with DNA conferring
resistance to phosphinothricin and hygromycin under the control of
separate CaMV 35S promoters, an attB-promoterless zeomycin
resistance-encoding DNA recombination cassette and unique sites for
adding additional markers, e.g., DNA encoding GFP. The attB site can be
used as described herein for the addition of new DNA sequences to plant
artificial chromosomes, including PACs formed as a result of using the
pAg1 vector, or derivatives thereof, in the production of PACs. The attB
site provides a convenient site for recombinase-mediated insertion of
DNAs containing a homologous att site.
2. pAg2

The vector pAg2 (SEQ. ID. NO: 95) is a derivative of vector pAg1
formed by adding DNA encoding a green fluorescent protein (GFP), under
the control of a NOS promoter and flanked at the 3' end by a NOS polyA
signal, to pAg1. pAg2 was constructed as follows. A DNA fragment
containing the NOS promoter was obtained by digestion of pGEM-T-NOS,
or pGEMEasyNOS (SEQ. ID. NO: 96), containing the NOS promoter in the
cloning vector pGEM-T-Easy (Promega Biotech, Madison, WI, U.S.A.),
with XbaI/NcoI and was ligated to a XbaI/NcoI fragment of pCambia 1302
containing DNA encoding GFP (without the CaMV 35S promoter) to
generate p1302NOS (SEQ. ID. NO: 97) containing GFP-encoding DNA in
operable association with the NOS promoter. Plasmid p1302NOS was
digested with SmaI/BsiWI to yield a fragment containing the NOS
promoter and GFP-encoding DNA. The fragment was ligated with
PmeI/BsiWI-digested pAg1 to generate pAg2. Thus, pAg2 contains DNA
from the bar gene conferring resistance to phosphinothricin, DNA
conferring resistance to hygromycin, both resistance-encoding DNAs
under the control of a cauliflower mosaic virus 35S promoter, DNA
encoding kanamycin resistance, a GFP gene under the control of a NOS
promoter and the attB-zeomycin resistance-encoding DNA. One of skill in
the art will appreciate that other fragments can be used to generate the
pAg1 and pAg2 derivatives and that other heterologous DNA can be
incorporated into pAg1 and pAg2 derivatives using methods well known
in the art.

3. pAgIIa and pAgIIb transformation vectors

Vectors pAgIIa and pAgIIb were constructed by inserting the
following DNA fragments into pAg1: DNA encoding β-glucuronidase, the
nopaline synthase terminator fragment, the nopaline synthase (NOS)
promoter fragment, a fragment of mouse satellite DNA and an N. tabacum
rDNA intergenic spacer sequence (IGS). The construction of pAgIIa and
pAgIIb was as follows.

An N. tabacum rDNA intergenic spacer (IGS) sequence (SEQ. ID.
NO: 98; see also GenBank Accession No. Y08422; see also Borysyuk et
al. (2000) Nature Biotechnology 18:1303-1306; Borysyuk et al. (1997)
Plant Mol. Biol. 35:655-660; U.S. Patent Nos. 6,100,092 and 6,355,860)
was obtained by PCR amplification of tobacco genomic DNA. The IGS
can be used as a targeting sequence by virtue of its homology to tobacco
rDNA genes; the sequence is also an amplification promoter sequence in
plants. This fragment was amplified using standard PCR conditions (e.g.,
as described by Promega Biotech, Madison, WI, U.S.A.) from tobacco
genomic DNA using the primers shown below:
NTIGS-F1

5'-GTG CTA GCC AAT GTT TAA CAA GAT G-3' (SEQ ID No. 99) and
NTIGS-R1

5'-ATG TCT TAA AAA AAA AAA CCC AAG TGA C-3' (SEQ ID No. 100)
Following amplification, the fragment was cloned into pGEM-T Easy to
give pIGS-1. A fragment of mouse satellite DNA (Msat1 fragment;
GenBank Accession No. V00846; and SEQ ID No. 101) was amplified via
PCR from pSAT-1 using the following primers:
MSAT-F1

5'-AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C-3' (SEQ ID No.
102) and
MSAT-R1

5'-ATA ACC GCG GAG TCC TTC AGT GTG CAT-3' (SEQ ID No. 103)

This amplification added a SacII and a HindIII site at the 5' end and a SacII
site at the 3' end of the PCR fragment. This fragment was then cloned
into the SacII site in pIGS-1 to give pMIGS-1, providing a eukaryotic
centromere-specific DNA and a convenient DNA sequence for detection
via FISH.
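
The restriction sites said to be added by the MSAT primers can be located directly in the primer sequences. The Python sketch below searches each primer for the standard recognition sequences of SacII (CCGCGG) and HindIII (AAGCTT). It is an illustrative check only; the helper name site_positions is hypothetical, and the primer sequences are taken from the text with the spacing removed.

```python
# Standard recognition sequences of the two enzymes named in the text.
SITES = {"SacII": "CCGCGG", "HindIII": "AAGCTT"}

PRIMERS = {
    "MSAT-F1": "AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C".replace(" ", ""),
    "MSAT-R1": "ATA ACC GCG GAG TCC TTC AGT GTG CAT".replace(" ", ""),
}


def site_positions(primer: str) -> dict:
    """Map each enzyme to the 0-based index of its site, or -1 if absent."""
    return {enzyme: primer.find(site) for enzyme, site in SITES.items()}


results = {name: site_positions(seq) for name, seq in PRIMERS.items()}
for name, hits in results.items():
    print(name, hits)
```

The forward primer carries a SacII site immediately followed by a HindIII site, while the reverse primer carries a SacII site only, which agrees with the description of the sites added at each end of the PCR fragment.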

A functional marker gene containing a NOS-promoter:GUS:NOS
terminator fusion was then constructed containing the NOS promoter
(GenBank Accession No. U09365; SEQ ID No. 104), E. coli
β-glucuronidase coding sequence (from the GUS gene; GenBank
Accession No. S69414; and SEQ ID No. 105), and the nopaline synthase
terminator sequence (GenBank Accession No. U09365; SEQ ID No. 107).
The NOS promoter in pGEM-T-NOS was added to a promoterless GUS
gene in pBlueScript (Stratagene, La Jolla, CA, U.S.A.) using NotI/SpeI to
form pNGN-1, which has the NOS promoter in the opposite orientation
relative to the GUS gene.
pMIGS-1 was digested with NotI/SpeI to yield a fragment
containing the mouse major satellite DNA and the tobacco IGS, which was
then added to NotI-digested pNGN-1 to yield pNGN-2. The NOS promoter
was then re-oriented to provide a functional GUS gene, yielding pNGN-3,
by digestion and religation with SpeI. Plasmid pNGN-3 was then digested
with HindIII, and the HindIII fragment containing the β-glucuronidase
coding sequence and the rDNA intergenic spacer, along with the Msat
sequence, was added to pAg1 to form pAgIIa (SEQ ID NO: 108), using
the unique HindIII site in pAg1 located near the right T-DNA border of
pAg1, within the T-DNA region.

Another plasmid vector, referred to as pAgIIb, was also recovered,
which contained the inserted HindIII fragment (SEQ ID NO: 108) in the
opposite orientation relative to that observed in pAgIIa. Thus, pAgIIa and
pAgIIb differ only in the orientation of the HindIII fragment containing the
mouse major satellite sequence, the GUS DNA sequence and the IGS
sequence. The nucleotide sequence of pAgIIa is provided in SEQ ID NO:
109.

Since modifications will be apparent to those of skill in this art, it is
intended that this invention be limited only by the scope of the appended 
claims.

WHAT IS CLAIMED IS:

1. A eukaryotic chromosome comprising one or a plurality of att
site(s), wherein:

an att site is heterologous to the chromosome; and

an att site permits site-directed integration in the presence of
lambda integrase.

2. The eukaryotic chromosome of claim 1, wherein the att sites 
are selected from the group consisting of attP and attB or attL and attR, 
or variants thereof. 

3. The eukaryotic chromosome of claim 1 that is an artificial
chromosome.

4. The eukaryotic chromosome of claim 1 that is an artificial 
chromosome expression system (ACes). 

5. The eukaryotic chromosome of claim 4 that is predominantly
heterochromatin.

6. The chromosome of claim 1 that is an artificial chromosome 
that contains no more than about 30%, 40%, 50%, 60%, 70%, 80%, 
90% or 95% euchromatin. 

7. The chromosome of claim 1 that is a plant chromosome.

8. The chromosome of claim 1 that is an animal chromosome.

9. The chromosome of claim 7 that is a plant artificial 
chromosome. 

10. The chromosome of claim 8 that is an animal artificial
chromosome.

11. The chromosome of claim 8 that is a mammalian
chromosome.

12. The chromosome of claim 11 that is a mammalian artificial
chromosome.

13. The chromosome of claim 6 that is an artificial chromosome
expression system (ACes). 

14. A platform artificial chromosome expression system (ACes)
comprising one or a plurality of sites that participate in recombinase
catalyzed recombination.

15. The ACes of claim 14 that contains one site. 

16. The ACes of claim 14 that is predominantly heterochromatin. 

17. The ACes of claim 14 that contains no more than about 
30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% euchromatin. 

18. The ACes of claim 14 that is a plant ACes.

19. The ACes of claim 14 that is an animal ACes. 

20. The ACes of claim 14 that is selected from a fish, insect, 
reptile, amphibian, arachnid or a mammalian ACes. 

21. The ACes of claim 14 that is a fish ACes. 

22. The artificial chromosome expression system (ACes) of claim
14, wherein the recombinase and site(s) are from the Cre/lox system of
bacteriophage P1, the int/att system of lambda phage, the FLP/FRT
system of yeast, the Gin/gix recombinase system of phage Mu, the Cin
recombinase system, the Pin recombinase system of E. coli and the R/RS
system of the pSR1 plasmid, or any combination thereof.

23. A method of introducing heterologous nucleic acid into a
chromosome, comprising:

contacting a chromosome of any of claims 1 or 14 with a nucleic
acid molecule comprising both the heterologous nucleic acid and a
recombination site, in the presence of a recombinase that promotes
recombination between the sites in the chromosome and in the nucleic
acid molecule.

24. The method of claim 23, wherein the recombinase is
selected from the group consisting of Cre, Gin, Cin, Pin, FLP, a phage
integrase and R from the pSR1 plasmid.

25. The method of claim 23, wherein the nucleic acid molecule
encodes a therapeutic protein, antisense nucleic acid, or comprises an
artificial chromosome.

26. The method of claim 25, wherein the nucleic acid molecule
comprises a yeast artificial chromosome (YAC), a bacterial artificial
chromosome (BAC) or an insect artificial chromosome (IAC).

27. A combination, comprising the chromosome of claim 1 and a
first vector comprising the cognate recombination site, wherein the
cognate recombination site is a site that recombines with the site
engineered into the chromosome.

28. The combination of claim 27, further comprising nucleic acid
encoding a recombinase, wherein the nucleic acid is on a second vector
or on the first vector, or on the ACes under an inducible promoter.

29. The combination of claim 28, wherein the recombinase and
sites are from the Cre/lox system of bacteriophage P1, the int/att system
of lambda phage, the FLP/FRT system of yeast, the Gin/gix recombinase
system of phage Mu, the Pin recombinase system of E. coli and the R/RS
system of the pSR1 plasmid, or any combination thereof.

30. The combination of claim 28, wherein a vector is the plasmid 
pCXLamlntR. 

31. The combination of claim 27, wherein a vector is the plasmid
pDsRedN1-attB.

32. A kit, comprising the combination of claim 27 and optionally 
instructions for introducing heterologous nucleic acid into the 
chromosome.

33. A method for introducing heterologous nucleic acid into a
platform artificial chromosome, comprising:

(a) mixing an artificial chromosome comprising at least a first
recombination site and a vector comprising at least a second
recombination site and the heterologous nucleic acid;

(b) incubating the resulting mixture in the presence of at least one 
recombination protein under conditions whereby recombination between 
the first and second recombination sites is effected, thereby introducing 
the heterologous nucleic acid into the artificial chromosome. 

34. The method of claim 33, wherein the artificial chromosome is
an ACes.

35. The method of claim 33, wherein said mixing step (a) is 
conducted in cells ex vivo. 

36. The method of claim 33, wherein said mixing step (a) is
conducted extracellularly in an in vitro reaction mixture.

37. The method of claim 33, wherein the at least one
recombination protein is encoded by a bacteriophage selected from the
group consisting of bacteriophage lambda, phi 80, P22, P2, 186, P4 and
P1.

38. The method of claim 37, wherein the at least one
recombination protein is encoded by bacteriophage lambda, or mutants
thereof.

39. The method of claim 33, wherein at least one recombination
protein is selected from the group consisting of Int, IHF, Xis and Cre, γδ,
Tn3 resolvase, Hin, Gin, Cin and Flp.

40. The method of claim 32, wherein the recombination sites are 
selected from the group consisting of att and loxP sites.

41. The method of claim 33, wherein the first and/or second
recombination site contains at least one mutation that removes one or 
more stop codons. 

42. The method of claim 33, wherein the first and/or second 
recombination site contains at least one mutation that avoids hairpin 
formation. 

43. The method of claim 33, wherein the first and/or second 
recombination site comprises at least a first nucleic acid sequence 
selected from the group consisting of SEQ ID NOs:41-56: 

a) RKYCWGCTTTYKTRTACNAASTSGB (m-att) (SEQ ID NO:41);
b) AGCCWGCTTTYKTRTACNAACTSGB (m-attB) (SEQ ID NO:42);
c) GTTCAGCTTTCKTRTACNAACTSGB (m-attR) (SEQ ID NO:43);
d) AGCCWGCTTTCKTRTACNAAGTSGB (m-attL) (SEQ ID NO:44);
e) GTTCAGCTTTYKTRTACNAAGTSGB (m-attP1) (SEQ ID NO:45);
f) AGCCTGCTTTTTTGTACAAACTTGT (attB1) (SEQ ID NO:46);
g) AGCCTGCTTTCTTGTACAAACTTGT (attB2) (SEQ ID NO:47);
h) ACCCAGCTTTCTTGTACAAACTTGT (attB3) (SEQ ID NO:48);
i) GTTCAGCTTTTTTGTACAAACTTGT (attR1) (SEQ ID NO:49);
j) GTTCAGCTTTCTTGTACAAACTTGT (attR2) (SEQ ID NO:50);
k) GTTCAGCTTTCTTGTACAAAGTTGG (attR3) (SEQ ID NO:51);
l) AGCCTGCTTTTTTGTACAAAGTTGG (attL1) (SEQ ID NO:52);
m) AGCCTGCTTTCTTGTACAAAGTTGG (attL2) (SEQ ID NO:53);
n) ACCCAGCTTTCTTGTACAAAGTTGG (attL3) (SEQ ID NO:54);
o) GTTCAGCTTTTTTGTACAAAGTTGG (attP1) (SEQ ID NO:55);
p) GTTCAGCTTTCTTGTACAAAGTTGG (attP2, P3) (SEQ ID NO:56);

and a corresponding or complementary DNA or RNA sequence,
wherein R = A or G, K = G or T/U, Y = C or T/U, W = A or T/U, N = A or C
or G or T/U, S = C or G, and B = C or G or T/U; and

the core region does not contain a stop codon in one or more
reading frames.

44. The method of claim 33, wherein the first and/or second
recombination site comprises at least a first nucleic acid sequence
selected from the group consisting of a mutated att recombination site
containing at least one mutation that enhances recombinational
specificity, a complementary DNA sequence thereto, and an RNA
sequence corresponding thereto.

45. The method of claim 33, wherein the vector comprising the
second site further encodes at least one selectable marker.

46. The method of claim 45, wherein the marker is a 
promoterless marker, which, upon recombination is under the control of a 
promoter and is thereby expressed. 

47. The method of claim 46, wherein the first recombination site
is attP and is in the sense orientation prior to recombination.

48. The method of claim 46, wherein the selectable marker is
selected from the group consisting of an antibiotic resistance gene, and a
detectable protein, wherein the detectable protein is chromogenic,
fluorescent, or capable of being bound by an antibody and FACS sorted.

49. The method of claim 48, wherein the selectable marker is
selected from the group consisting of green fluorescent protein (GFP), red
fluorescent protein (RFP), blue fluorescent protein (BFP), and E. coli
histidinol dehydrogenase (hisD).

50. A cell comprising the chromosome of claim 1.

51. The cell of claim 50, wherein the cell is a nuclear donor cell.

52. The cell of claim 50, wherein the cell is a stem cell. 

53. The stem cell of claim 52, wherein said stem cell is human 
and is selected from the group consisting of a mesenchymal stem cell, a 
hematopoietic stem cell, an adult stem cell and an embryonic stem cell.

54. The cell of claim 50, wherein the cell is mammalian.

55. The cell of claim 54, wherein the mammal is selected from 
the group consisting of humans, primates, cattle, pigs, rabbits, goats, 
sheep, mice, rats, guinea pigs, hamsters, cats, dogs, and horses. 

56. The cell of claim 50, wherein the cell is a plant cell. 

57. A cell comprising the platform ACes of claim 14. 

58. The cell of claim 57, wherein the cell is a nuclear donor cell. 

59. The cell of claim 57, wherein the cell is a stem cell. 

60. The stem cell of claim 59, wherein said stem cell is human 
and is selected from the group consisting of a mesenchymal stem cell, a 
hematopoietic stem cell, an adult stem cell and an embryonic stem cell. 

61. A human mesenchymal cell comprising an artificial 
chromosome. 

62. The human mesenchymal cell of claim 61, wherein said 
artificial chromosome is an ACes. 

63. The human mesenchymal cell of claim 62, wherein the ACes
is a platform-ACes.

64. A method for introducing heterologous nucleic acid into the 
mesenchymal cell of claim 63, comprising: 

(a) introducing into the cell of claim 63, wherein the platform-ACes
has a first recombination site, a vector comprising at least a second
recombination site and the heterologous nucleic acid;

(b) incubating the resulting mixture in the presence of at least one
recombination protein under conditions whereby recombination between
the first and second recombination sites is effected, thereby introducing
the heterologous nucleic acid into the platform-ACes within the
mesenchymal cell.

65. A lambda-intR mutein comprising a glutamic acid to arginine 
change at position 174 of wild-type lambda-intR. 



66. The lambda-intR mutein of claim 65, wherein the lambda-intR
mutein comprises SEQ ID NO:37. 

67. The method of claim 46, wherein the promoterless marker is
transcriptionally downstream of the heterologous nucleic acid, wherein
the heterologous nucleic acid encodes a heterologous protein, and
wherein the expression level of the selectable marker is transcriptionally
linked to the expression level of the heterologous protein.

68. The method of claim 67, wherein the selectable marker and
the heterologous nucleic acid are transcriptionally linked by the presence
of an IRES between them.

69. The method of claim 68, wherein the selectable marker is 
selected from the group consisting of an antibiotic resistance gene, and a 
detectable protein, wherein the detectable protein is chromogenic or 
fluorescent. 

70. The method of claim 69, wherein the selectable marker is
selected from the group consisting of green fluorescent protein (GFP), red
fluorescent protein (RFP), blue fluorescent protein (BFP), and E. coli
histidinol dehydrogenase.

71. The method of claim 67, further comprising expressing the
heterologous protein and isolating the heterologous protein.

72. A method for producing a transgenic animal, comprising
introducing a platform-ACes into an embryonic cell.

73. The method of claim 72, wherein the embryonic cell is a 
stem cell. 

74. The method of claim 72, wherein the embryonic cell is in an
embryo.

75. The method of claim 72, wherein the platform-ACes
comprises heterologous nucleic acid that encodes a therapeutic product.

76. The method of claim 72, wherein the transgenic animal is a
fish, insect, reptile, amphibian, arachnid or mammal.

77. The method of claim 72, wherein the ACes is introduced by
cell fusion, lipid-mediated transfection by a carrier system, microinjection,
microcell fusion, electroporation, microprojectile bombardment or direct
DNA transfer.

78. A transgenic animal produced by the method of claim 72. 

79. A cell line useful for making a library of ACes, comprising a
multiplicity of heterologous recombination sites randomly integrated
throughout the endogenous chromosomes.

80. A method of making a library of ACes comprising random
portions of a genome, comprising introducing one or more ACes into the
cell line of claim 79, under conditions that promote the site-specific
chromosomal arm exchange of the ACes into, and out of, a multiplicity of
the heterologous recombination sites within the cell's chromosomal DNA;
and isolating said multiplicity of ACes, thereby producing a library of
ACes whereby multiple ACes have different portions of the genome
within.

81. A library of cells useful for genomic screening, said library
comprising a multiplicity of cells, wherein each cell comprises an ACes
having a mutually exclusive portion of a chromosomal nucleic acid
therein.

82. The library of cells of claim 81, wherein the cells of the
library are from a different species than the chromosomal nucleic acid
within the ACes.

83. A method of making one or more cell lines, comprising:

a) integrating into endogenous chromosomal DNA of a selected cell
species, a multiplicity of heterologous recombination sites;

b) introducing a multiplicity of ACes under conditions that promote
the site-specific chromosomal arm exchange of the ACes into, and out of,
a multiplicity of the heterologous recombination sites integrated within the
cell's endogenous chromosomal DNA;

c) isolating said multiplicity of ACes, thereby producing a library of
ACes whereby a multiplicity of ACes have mutually exclusive portions of
the endogenous chromosomal DNA therein;

d) introducing the isolated multiplicity of ACes of step c) into a
multiplicity of cells, thereby creating a library of cells;

e) selecting different cells having mutually exclusive ACes therein
and clonally expanding or differentiating said different cells into clonal cell
cultures, thereby creating one or more cell lines.

84. The method of claim 23, wherein the nucleic acid molecule
with a recombination site is a PCR product.

85. The method of claim 23, wherein the recombinase is a protein and
the recombination event occurs in vitro.

86. The method of claim 33, wherein the vector is a PCR 
product comprising a second recombination site. 

87. The lambda-intR mutein of claim 65, wherein the mutein
further comprises an amino acid signal for nuclear localization.

88. The lambda-intR mutein of claim 65, wherein the mutein 
further comprises an epitope tag for protein purification. 

89. A modified iron-induced promoter comprising SEQ ID 
NO:128. 

90. A plasmid or expression cassette comprising the promoter of
claim 89.

91. A vector, comprising:

a recognition site for recombination; and

a sequence of nucleotides that targets the vector to an
amplifiable region of a chromosome.

92. The vector of claim 91, wherein the amplifiable region
comprises heterochromatic nucleic acid.

93. The vector of claim 91, wherein the amplifiable region
comprises rDNA.

94. The vector of claim 93, wherein the rDNA comprises an 
intergenic spacer. 

95. The vector of claim 91, further comprising nucleic acid
encoding a selectable marker that is not operably associated with any
promoter.

96. The vector of claim 91, wherein the chromosome is a
mammalian chromosome.

97. The vector of claim 91, wherein the chromosome is a plant
chromosome.

98. A cell of claim 57 that is a plant cell, wherein the ACes 
platform is a MAC. 

99. The plant cell of claim 98, wherein the MAC comprises 
transcriptional regulatory sequence of nucleotides derived from plants. 

100. The plant cell of claim 99, wherein the regulatory sequence
is selected from the group consisting of promoters, terminators,
enhancers, silencers and transcription factor binding sites.

101. A cell of claim 57 that is an animal cell, wherein the ACes
platform is a plant artificial chromosome (PAC).

102. The cell of claim 101 that is a mammalian cell.

103. The cell of claim 98, wherein the MAC comprises 
transcriptional regulatory sequence of nucleotides derived from plants. 

104. The cell of claim 102, wherein the MAC comprises 
transcriptional regulatory sequence of nucleotides derived from plants.

105. The cell of claim 104, wherein the regulatory sequence is
selected from the group consisting of promoters, terminators, enhancers, 
silencers and transcription factor binding sites. 

106. A method, comprising:

introducing a vector of claim 91 into a cell;

growing the cells; and

selecting a cell comprising an artificial chromosome that comprises
one or more repeat regions.

107. The method of claim 106, wherein a sufficient portion of the
vector integrates into a chromosome in the cell to result in amplification
of chromosomal DNA.

108. The method of claim 106, wherein the artificial chromosome
is an ACes.

109. A method for screening, comprising:

contacting a cell comprising a reporter ACes with test compounds
or known compounds, wherein:

the reporter ACes comprises one or a plurality of reporter
constructs;

a reporter construct comprises a reporter gene in operative linkage
with a regulatory region responsive to test or known compounds; and

detecting any increase or decrease in signal output from the
reporter, wherein a change in the signal is indicative of activity of the test
or known compound on the regulatory region.

110. The method of claim 109, wherein the reporter is operatively
linked to a promoter that controls expression of a gene in a signal
transduction pathway, whereby activation or reduction in the signal
indicates that the pathway is activated or down-regulated by the test
compound.

111. The method of claim 109, wherein the reporter in the
construct encodes drug resistance or encodes a fluorescent protein.

112. The method of claim 111, wherein the fluorescent protein is
selected from the group consisting of red, green and blue fluorescent
proteins.

113. The method of claim 109, wherein the ACes comprises a 
plurality of reporter-linked constructs, each with a different reporter, 
whereby the pathway(s) affected by the test compounds can be 
elucidated. 

114. The method of claim 109, wherein a reporter is operatively
linked to a promoter that is transcriptionally regulated in response to DNA
damage, and the test compounds are genotoxicants.

115. The method of claim 114, wherein the DNA damage is
induced by apoptosis, necrosis or cell-cycle perturbations.

116. The method of claim 114, wherein unknown compounds are
screened to assess whether they are genotoxicants.

117. The method of claim 114, wherein the promoter is a
cytochrome P450-profiled promoter.

118. The method of claim 114, wherein the cell is in a transgenic
animal and toxicity is assessed in the animal.

119. The method of claim 109, wherein:

the cell is a patient cell sample; the patient has a disease;

the regulatory region is one targeted by a drug or drug regimen;
and

the method assesses the effectiveness of a treatment for the
disease for the particular patient.

120. The method of claim 119, wherein the cell is a tumor cell. 

121. The method of claim 109, wherein the cell is a stem cell or a
progenitor cell, whereby expression of the reporter is operatively linked to
a regulatory region expressed in the cells to thereby identify stem cells or
progenitor cells.

122. The method of claim 109, wherein the cell is in an animal; 
and the method comprises whole-body imaging to monitor expression of 
the reporter in the animal. 

123. A reporter ACes comprises one or a plurality of reporter 
constructs, wherein the reporter construct comprises a reporter gene in 
operative linkage with a regulatory region responsive to test or known 
compounds. 
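
The stop-codon condition recited for the att core sequences of claim 43 can be illustrated with a short check. This is a minimal sketch under stated assumptions, not a method taken from the specification: it scans only the three forward reading frames, uses the standard stop codons TAA, TAG and TGA, and the function name `stop_free_frames` is hypothetical. The attB1 and attB2 sequences are items f) and g) of claim 43; SEQ ID NO:8 is taken from the sequence listing.

```python
# Standard stop codons (DNA alphabet); an assumption of this sketch.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_frames(seq: str) -> list:
    """Return the forward reading frames (0, 1, 2) that contain no stop codon."""
    # Normalize: drop spaces, upper-case, and treat RNA U as DNA T.
    s = seq.replace(" ", "").upper().replace("U", "T")
    frames = []
    for frame in range(3):
        # Walk complete codons only, starting at the frame offset.
        codons = (s[i:i + 3] for i in range(frame, len(s) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(frame)
    return frames


ATTB1_SEQ_46 = "AGCCTGCTTTTTTGTACAAACTTGT"    # claim 43, item f)
ATTB2_SEQ_47 = "AGCCTGCTTTCTTGTACAAACTTGT"    # claim 43, item g)
SEQ_ID_8 = "tgaagcctgcttttttatactaacttgagcgaa"  # from the sequence listing

for label, seq in [("SEQ ID NO:46", ATTB1_SEQ_46),
                   ("SEQ ID NO:47", ATTB2_SEQ_47),
                   ("SEQ ID NO:8", SEQ_ID_8)]:
    print(label, "stop-free forward frames:", stop_free_frames(seq))
```

In this sketch the two claim 43 sequences are free of stop codons in all three forward frames, whereas SEQ ID NO:8 has a stop codon in each forward frame. Reverse-strand frames are not examined here.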
SEQUENCE LISTING

<110> CHROMOS MOLECULAR SYSTEMS, INC.
Perkins, Edward
Perez, Carl
Lindenbaum, Michael
Greene, Amy
Leung, Josephine
Fleming, Elena
Stewart, Sandra
Shellard, Joan

<120> CHROMOSOME-BASED PLATFORMS

<130> 24601-420PC

<140> Not Yet Assigned
<141> Herewith 

<150> 60/294,758 
<151> 2001-05-30 

<150> 60/366,891 
<151> 2002-03-21 

<160> 129 

<170> FastSEQ for Windows Version 4.0 

<210> 1
<211> 25
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: attPUP
<400> 1
ccttgcgcta atgctctgtt acagg 25

<210> 2
<211> 26
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: attPDWN
<400> 2
cagaggcagg gagtgggaca aaattg 26

<210> 3
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: Lamint 1
<400> 3
ttcgaattca tgggaagaag gcgaagtcat gagcg 35

<210> 4
<211> 34
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: Lamint 2
<400> 4
ttcgaattct tatttgattt caattttgtc ccac 34

<210> 5
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer
<400> 5
cggacaatgc ggttgtgcgt 20

<210> 6
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> primer
<400> 6
cgcgcagcaa aatctagagt aaggagatca agacttacgg ctgacg 46

<210> 7
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> LambdaINTER174rev
<400> 7
cgtcagccgt aagtcttgat ctccttactc tagattttgc tgcgcg 46

<210> 8
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> attB1
<400> 8
tgaagcctgc ttttttatac taacttgagc gaa 33

<210> 9
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> attB2
<400> 9
ttcgctcaag ttagtataaa aaagcaggct tca 33

<210> 10
<211> 25
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: attPdwn2
<400> 10
tcttctcggg cataagtcgg acacc 25

<210> 11
<211> 25
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: CMVen
<400> 11
ctcacgggga tttccaagtc tccac 25

<210> 12
<211> 26
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: attPdwn
<400> 12
cagaggcagg gagtgggaca aaattg 26

<210> 13
<211> 25
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: CMVEN2
<400> 13
caactccgcc ccattgacgc aaatg 25

<210> 14
<211> 26
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: L1
<400> 14
agtatcgccg aacgattagc tcttca 26

<210> 15
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: F1 rev
<400> 15
gccgatttcg gcctattggt taaa 24

<210> 16
<211> 25
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: RED
<400> 16
ccgccgacat ccccgactac aagaa 25

<210> 17
<211> 25
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer: L2 rev
<400> 17
ttccttcgaa ggggatccgc ctacc 25

<210> 18
<211> 22118
<212> DNA
<213> Mus musculus



<300> 

<308> GenBank X82564 
<309> 1996-04-09 



<400> 18 

gaattcccct atccctaatc cagattggtg gaataacttg gtatagatgt ttgtgcatta 60
aaaaccctgt aggatcttca ctctaggtca ctgttcagca ctggaacctg aattgtggcc 120
ctgagtgata ggtcctggga catatgcagt tctgcacaga cagacagaca gacagacaga 180
cagacagaca gacagacgtt acaaacaaac acgttgagcc gtgtgccaac acacacacaa 240
acaccactct ggccataatt attgaggacg ttgatttatt attctgtgtt tgtgagtctg 300
tctgtctgtc tgtctgtctg tctgtctgtc tatcaaacca aaagaaacca aacaattatg 360
cctgcctgcc tgcctgcctg cctacacaga gaaatgattt cttcaatcaa tctaaaacga 420
cctcctaagt ttgccttttt tctctttctt tatctttttc ttttttcttt tcttcttcct 480
tccttccttc cttccttcct tccttccttt ctttctttct ttctttcttt cttactttct 540
ttctttcctt cttacattta ttcttttcat acatagtttc ttagtgtaag catccctgac 600
tgtcttgaag acactttgta ggcctcaatc ctgtaagagc cttcctctgc ttttcaaatg 660
ctggcatgaa tgttgtacct cactatgacc agcttagtct tcaagtctga gttactggaa 720
aggagttcca agaagactgg ttatattttt catttattat tgcattttaa ttaaaattta 780
atttcaccaa aagaatttag actgaccaat tcagagtctg ccgtttaaaa gcataaggaa 840
aaagtaggag aaaaacgtga ggctgtctgt ggatggtcga ggctgcttta gggagcctcg 900
tcaccattct gcacttgcaa accgggccac tagaacccgg tgaagggaga aaccaaagcg 960
acctggaaac aataggtcac atgaaggcca gccacctcca tcttgttgtg cgggagttca 1020
gttagcagac aagatggctg ccatgcacat gttgtctttc agcttggtga ggtcaaagta 1080
caaccgagtc acagaacaag gaagtataca cagtgagttc caggtcagcc agagtttaca 1140
cagagaaacc acatcttgaa aaaaacaaaa aaataaatta aataaatata atttaaaaat 1200
ttaaaaatag ccgggagtga tggcgcatgt ctttaatccc agctctcttc aggcagagat 1260
gggaggattt ctgagtttga ggccagcctg gtctgcaaag tgagttccag gacagtcagg 1320
gctatacaga gaaaccctgt cttgaaaact aaactaaatt aaactaasct aaactaaaaa 1380
aatataaaat aaaaatttta aagaatttta aaaaactaca gaaatcaaac ataagcccac 1440
gagatggcaa gtaactgcaa tcatagcaga aatattatac acacacacac acacagactc 1500
tgtcataaaa tccaatgtgc cttcatgatg atcaaatttc gatagtcagt aatactagaa 1560
gaatcatatg tctgaaaata aaagccagaa ccttttctgc ttttgttttc ttttgcccca 1620
agatagggtt tctctcagtg tatccctggc atccctgcct ggaacttcct ttgtaggttt 1680
ggtagcctca aactcagaga ggtcctctct gcctgcctgc ctgcctgcct gcctgcctgc 1740
ctgcctgcct gcctgcctca cttcttctgc cacccacaca accgagtcga acctaggatc 1800
tttatttctt tctctttctc tcttctttct ttctttcttt ctttctttct ttctttcttt 1860
ctttctttct ttcttattca attagttttc aatgtaagtg tgtgtttgtg ctctatctgc 1920
tgcctatagg cctgcttgcc aggagagggc aacagaacct aggagaaacc accatgcagc 1980
tcctgagaat aagtgaaaaa acaacaaaaa aaggaaattc taatcacata gaatgtagat 2040
atatgccgag gctgtcagag tgctttttaa ggcttagtgt aagtaatgaa aattgttgtg 2100
tgtcttttat ccaaacacag aagagaggtg gctcggcctg catgtctgtt gtctgcatgt 2160
agaccaggct ggccttgaac acattaatct gtctgcctct gcttccctaa tgctgcgatt 2220
aaaggcatgt gccaccactg cccggactga tttcttcttt tttttttttt tggaaaatac 2280
ctttctttct ttttctctct ctctttcttc cttccttcct ttctttctat tctttttttc 2340
tttctttttt cttttttttt ttttttttaa aatttgccta aggttaaagg tgtgctccac 2400
aattgcctca gctctgctct aattctcttt aaaaaaaaac aaacaaaaaa aaaaccaaaa 2460
cagtatgtat gtatgtatat ttagaagaaa tactaatcca ttaataactc ttttttccta 2520
aaattcatgt cattcttgtt ccacaaagtg agttccagga cttaccagag aaaccctgtg 2580



ttcaaatttc 
ctacacagaa 
acacacacac 
aagtcgtgcc 
tactcctaga 
aactctgaat 
acgggcgggc 
accccaagcg 
ccaccctcct 
ctgtgcctaa 
caagaacgat 
gtctagcccg 
agtggtgggt 
agaccctccg 
ggtttgtatg 
acgctccagg 
agggtgacag 
gacggtctct 
gcccttttgg 
agtcctaccc 
caccgggggc 
tgtggctcgg 
gaagccttgt 
gggccccggc 
tttttttttt 
tctgaggccg 
gcttcgggtt 
tacttctgag 
ctggagcttt 
cgggggcacc 
ggcggggcca 
cgtcaccggg 
ggatgtcgcc 
tcttgtcccc 
cttccaagcc 
cagaagcctt 
atgggcccgg 
tttttttttt 
gatgccgaga 
tttggatctt 
caccttacat 
gtcagctgga 
accggtggca 
gtggcccggt 
cctct tgtcc 
ggcttccagg 
tttttcctcc 
gggaaagcta 
tgtcagggtc 
gggccacctc 
tctcttttat 
cacgctgtcc 
gctgttttgc 
ctgt ccccga 
gcagcttgtg 
cccgaggtgt 
gccsccttat 
tcttttctct 
ttcttttttt 
tggtgtccaa 
cgttgtgttc 
acattcctat 
ggtgctccgg 
atggcgaatg 
cgtctgccgg 
cgcacttttc 
tpacgtgttt 



tgtgttcaag 
aaaccatatc 
acacacacac 
taaaataaat 
aaaaataaat 
ttagtcttgg 
gggcgggtga 
gtagagtgtt 
cttccactgc 
ctgtgcctgt 
tttgcctgtt 
ttcgctatgt 
gggtacgctg 
gagagacaga 
gttgatcgag 
cctctcaggt 
gaggccgggc 
aacaaggagg 
gaaaaatgct 
ccccccccct 
accgtacatc 
ccagctggcg 
ctgtcgctgt 
ttccaagccg 
tttttttctc 
agaggacgcg 
tttttttttc 
gccgagagga 
ggatcfctttt 
ttacatctga 
gctggagctt 
ggcgctgtac 
cggtcagctg 
gtcaccgggg 
gatgtggccc 
gtctgtcgct 
cttccaagcc 
ttcctccaga 
ggacgcgatg 
tttttttttt 
ctgaggccta 
gctttggatc 
ctgtacatct 
cagctggagc 
ctgtcaccgg 
ccgatgtggc 
agaagccctc 
tgggcgcggt 
gaccagttgt 
cccaggtatg 
gcttgtgatc 
tttccctatt 
ttgtccagcc 
gccacgcttc 
acaactgggc 
cgttgtcaca 
ttcggctcac 
tcccggtctt 
tttttttttt 
gtgttcatgc 
tcttgttctg 
ctcgcttgtt 
agttctcttc 
gcggccgctc 
tggtgtgtgg 
tcagtggttc 
cactttggtc 



gtcaccctgg 
t cagaaaaaa 
acacacacac 
atttttctgg 
acaaacgggc 
aaaagggggc 
gtggccggcg 
ttaaaaatga 
ttagatgctc 
tccctcaccc 
ttcaccgctc 
tcgggcggga 
ctccgtcgtg 
atgagtgagt 
accattgtcg 
tggtgacaca 
aagcaggcgg 
tcgtacaggg 
agggttggtg 
tttttttttt 
tgaggccgag 
cttcgggtct 
caccgggggc 
gtgtggctcg 
cagaagcctt 
atgggtcggc 
ctccagaagc 
cgtgatgggc 
tttttttttt 
gggcgagagg 
cgggtttttt 
ttctgaggcc 
gagctttgga 
gcaccgtaca 
ggccagctgg 
gtcacccggg 
ggtgtggctc 
aaccttgtct 
ggcccgtctt 
ttttcctcca 
gaggacacga 
tttttttttt 
gaggcggaga 
tttggatctt 
tggcacggta 
ccggtcagct 
tctgtccctg 
tttctttcat 
tcctttgagg 
acttccaggc 
ttttctatct 
aacactaaag 
tattcttttt 
ctgctttccc 
gctgtgactt 
cctgtcccgg 
tttttttttt 
tcttccacat 
ttggggaggt 
cacgtgcctc 
tgtctgcccg 
tctcccgatt 
gggccagggc 
ttctcgttct 
aaggcagggg 
gcgtggtcct 
gtgtctcgct 



cttacaaagt 
aaaaagttcc 
acacacacag 
ccaaagtgaa 
tttttaatca 

gggtgtgggt 

gcggtggcag 
gacctaaatg 
ccttcccctt 
cgctgattcg 
cctgtcatac 
cgatggggac 
cgtgcgtgag 
gaatgtggcg 
ggcgacacct 
ggagagggaa 
gagcgtctcg 
agatggccaa 
gcaacgttac 
tttcctccag 
aggacgcgat 
tttttttttt 
gctgtacttc 
gccagctgga 
gtctgtcgct 
ttccaagccg 
cctctcttgt 
ccgggttcca 
cctccagaag 
acgtgatggg 
ttttttcctc 
gagaggacgt 
tcattttttt 
tctgaggccg 
agcttcgggt 
gcgctgtact 
ggccagctgg 
gtcgctgtca 
ccaggccgat 
gaagccctct 
tgggcccggg 
ttttcttcca 
ggacattatg 
attttttttt 
catctgaggc 
ggagctttgg 
tcaccggggg 
tgacctgtcg 
tccggttctt 
gtcgttgctc 
gttcctattg 
gacactataa 
actggcttgg 
gggcttgctg 
tgctgcgtgt 
ttggaatggt 
tttttttctc 
gcctcccgag 
ggagagtccc 
ccgagtgcac 
tatcagtaac 
gcgcgtcgtt 
caagccgcgc 
gccagcgggc 
tgcggctctc 
tgtggatgtg 
tgaccatgtt 



gagttccaag 

aaacacacac 

cgcgccgcgg 

agcaaatcac 

ttccagcact 

gagtgagggc 

cgagcaccag 

tggtggaacg 

actgtgctcc 

ccagcgacgt 

tttcgttttt 

cgtttgtgcc 

tgccggaacc 

gcgcgtgacg 

agtggtgaca 

gtgcctgtgg 

gagatggtgt 

agcagaccga 

taggtcgacc 

aagccctctc 

gggcccggct 

tttttttttt 

tgaggccgag 

gcttcgggtc 

gtcaccgggg 

atgtggcggg 

ccccgtcacc 

ggcggatgtc 

ccctctcttg 

tccggcttcc 

cagaagccct 

gatgggcccg 

ttttccctcc 

agaggacacg 

cttttttttt 

tctgaggccg 

agcttcgggt 

cccggggcgc 

gtggcccggt 

cttgtccccg 

ttccaggccg 

gaagccctct 

ggcccggctt 

taattttttc 

cgagaggaca 

atcttttttt 

ccctgtacgt 

gtcttatcag 

ttcgttatgg 

gcctgtcact 

gacctggaga 

agagaccctt 

gtctgtcgcg 

cttgcgtgtg 

cagacgtttt 

ggagccagct 

ttggagtccc 

tgcatttctt 

gagtacttca 

ttttttttgt 

tgtcttgccc 

gctcactctt 

caggcgaggg 

cctcgtctct 

cggcccgacg 

tgaggcgccc 

cccagagtcg 



tccgataggg 
acacacacac 
cgatgagggg 
tatgaagagg 
gttttaattt 
gagcgagcag 
aaaacaacaa 
gaggtcgccg 
cttcccctaa 
actttgactt 

gggtgcccga 

actcgggaga 
tgagctcggg 
gatctgtatt 
agtttcggga 
tgaggcgacc 
cgtgtttaag 
gttgctgtac 
agaaggctta 
ttgtccccgt 
tccaagccgg 
ttttcctcca 
aggacgcgat 
tttttttttt 
gcgctgtact 
gccagctgga 

gggggcgctg 

gcccggtcag 
tccccgtcac 
aagccgatgt 
ctcttgtccc 
ggttccaggc 
agaagccctc 
atgggcctgt 
ttttttcctc 
agaggacgcg 
cttttttttt 
ttgtacttct 
cagctggagc 
tcaccggggg 
atgtggcccg 
tgtccccgtc 
ccaatccgat 
ttccagaagc 
ttatgggccc 
ttttttttct 
ctgaggccga 
ttctccgggt 
ggtcattttt 
ttcctccctg 
taggtactga 
tcgatttaag 
gtgcctgaag 
cttgctgtgg 
tcccgatttc 
gtggttgagg 
gaacctccgc 
tttgtttttt 
ctcctgtctg 
ggcagtcgct 
cgcgtg[taag 
agatcgatgt 
acggacattc 
ccaccccatc 
ctgccccgcg 
ggttgtgccc 
gtggatgtgg 



2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 
ccggtggcgt 

gaggtgctcc 

gcgctcccca 

tgtctgagaa 

tcgtcgggfcg 

ggtcgcggct 

gagaggcctg 

aatgccctfcg 

ttggtcttct 

gtcggggtfct 

ggaaagggtg 

ctcgccccet 

ggcctccccg 

ccgttgctgc 

gcacaccccc 

tgggtaggcg 

tccgtcgcgt 

ctgcgccgcg 

ccccccttcc 

cctcggggtc 

gttctgfcggg 

gccgctcggg 

cggtgtcgcc 

ggtgtggtgg 

tgccctgacc 

gaggggcccg 

ccccctcccc 

acccgtggcc 

cggtcaccgg 

gagctgtggt 

gagagggcfcg 

agtggtcatt 

ccggccctgt 

accctggcgg 

gatgtctacc 

cctcgttcct 

catctctcgc 

tcgccggggg 

ctcgccggct 

gacgttgcgc 

gagcccctgc 

tgtgtcgcgt 

gacgggtggc 

tcgttggtgt 

tcgccggtgt 

cggcccggtg 

gggacggagg 

gttggcfcfctg 

fcccggccgca 

cctcccgcga 

cctggtcctg 

ggtagcatat 

agtgaaactg 

ctacttggat 

tcccgggggg 

ctccggccgg 

acgccccccg 

tcgccgtgcc 

gagcctgaga 

cgacccgggg 

ggaatgagtc 

cagccgcggt 

tagttggatc 

ccccttgcct 

gtttactttg 

aggaataatg 

taagagggac 



tgcataccct 
tggagcgttc 
ttccctggtg 
gecegtgaga 
aggcgcccac 

ggggttggaa 

gefcttegggg 

gaagagaacc 

ggfcttccctg 

tgggtccgtc 

egggcttett 

gaccgcctcc 

ctccgagttc 

ggagcatgtg 

gcgtgcgcgt 

acggtgggct 

gcgtccctct 

cgtggtgcgt 

cgcggcagcg 

gagagggtcc 

agaaeggctg 

ggtcttcgtc 

tcctcgggct 

gaetgetcag 

ggtccgacgc 

tttcggccgc 

gctcgccgca 

gtgctgtcgg 

ggtcttgggg 

ttggagggcg 

cgtgcgaggg 

gtcccgacgg 

cgtccgtcgg 

tgggattaac 

fcccctcfcccc 

ccctctcgcg 

geaatggege 

ctggccgctg 

tcgcggactc 

ctcgctgctg 

cgcacccgcc 

egggagegtg 

ctatccaggg 

ggggagtgaa 

cgcgcttctc 

cggtcgacgt 

ggagagcggg 

ccgcgtgcgt 

tgcactctcc 

ggctctccgc 

tcccaccccc 

gctfcgtctca 

cgaatggctc 

aactgtggta 

ggatgcgtgc 

gggtcgggcg 

tggeggegae 

taccatggtg 

aacggctacc 

aggtagtgac 

cactttaaat 

aattccagct 

ttgggagcgg 

ctcggcgccc 

aaaaaattag 

gaataggacc 

ggceggggge 



tcccgtctgg 
caggtttgtc 
tgcctccggt 

ggggggtcga 

cccgcgacta 

agtfctctcga 

gggaccggtt 

ttcctgttgc 

tgtgctcgtc 

ccgccctcag 

aeggtctega 

cgcgcgcgca 

ggggagggat 

geteggcttg 

actttcctcc 

cccgggtccc 

cgctcgcgtc 

gctgfcgtgct 

ttcccacggc 

gtgtctggcg 

ttggccgcgt 

ggtaggcatc 

cccggggggc 

gggagtggtg 

ccgagcggtc 

ccttgccgtc 

geeggtcttt 

accccccgca 

gggggecgag 

tcccggcccc 

gaaaaggttg 

tgtggfcggtc 

gaaggcgcgt 

cccgcgcgcg 

cgaggtctca 

gggfcteaagt 

cgcccgagtt 

tccggtctct 

ctggcttcgc 

tgtgcttggg 

ggtgtgcggt 

tccgcctcgc 

ctcgcccccg 

tggfcgctacc 

tttccgccaa 

tccggctctc 

taagagaggt 

gtgetcgegg 

cgttccgcgc 

cgccgccgcc 

gacgctccgc 

aagattaagc 

attaaatcag 

attctagagc 

atttatcaga 

ccggcggctt 

gacccattcg 

accaegggtg 

acatccaagg 

gaaaaataac 

cctttaacga 

ecaatagegt 

gcgggcggtc 

cctcgatgct 

agtgttcaaa 

gcggttctat 

attegtattg 



tgtgfcgcacg 
tcctaggtgc 
gctccgtctg 
ggagagaagg 
gtacgccfcgt 
gagactcatt 
gcagggtctc 
cgcagacccc 
gcatgcatcc 
tgagaaagtt 
ggggtctctc 
gcgtttgctc 
cacgcggggc 
tgtggttggt 
cctcctgagg 
cacccgtctt 
cacgactttg 
tetegggctg 
tggegaaate 
ttgattgatc 
ccggcgcgac 
ggtgtgtcgg 
cgtcgtgttt 
cagtgtgatt 
tctcggtccc 
gtcgccggcc 
tttcctctct 
tgggggegge 
gggtaagaaa 
gcggccgtgg 
ccccgcgagg 
tgttggccga 
gtfcggggcct 
tgtcccggtg 
ggccttctcc 
cgctcgtcga 
cacggtgggt 
cctgcccgac 
ccggagggtc 

gggggcccgc 

ttcgcgccgc 
ggeggctaga 
ccgacccccg 
ggtcattccc 
cccccacgcc 
ccgatgccga 
gteggagage 
acgggfcttfcg 
gagcgcccgc 
tcctcctcct 
tcgcgcttcc 
catgcatgtc 
ttatggttcc 
taatacatgc 
tcaaaaccaa 
ggtgacfccta 
aacgtctgcc 
aeggggaate 
aaggcagcag 
aatacaggac 
ggatccattg 
atattaaagt 
cgccgcgagg 
cttagctgag 
gcaggcccga 
tttgttggfct 
cgccgctaga 



cgctgtttct 

ctgcttctga 

gctgtgtgcc 

aggggcaaga 

gegtaggget 

gctttcccgfc 

ccctgtccgc 

cccgcgcggt 

tetcteggtg 

tccttctcta 

ccgaatggtc 

tctcgtctac 

agagcctgtc 

ggctggggag 

gccgccgtgc 

cccgtgcctc 

gccgctcccg 

tgtggttgtg 

gegggagtec 

tcgcfcctcgg 

gtcggacgtg 

catcggtctc 

egggtegget 

cccgccggtt 

ttgtgaggac 

ctcgttctgc 

cccccccfccfc 

cgggcacgta 

gteggctegg 

cggtgtcttg 

gcaaagggaa 

ggtgcgtctg 

gccggagtgc 

tggcggtggg 

gcgcgggctc 

cctcccctcc 

tcgtcctccg 

ccccgttggc 

agggggcttc 

tgcggcctcc 

ggfccagttgg 

cgcgggtgtc 

cctgcccgtc 

tcccgcgtgg 

aacccaccac 

ggggttcggg 

tgtcccgggg 

tcggaccccg 

ccggctcacc 

ctctcgcgct 

ttacctggtt 

taagtacgea 

ttfcggfccgct 

cgacgggcgc 

cccggtgagc 

gataacctcg 

ctatcaactt 

agggttcgat 

gcgcgcaaat 

tctttcgagg 

gagggcaagt 

tgctgcagtt 

cgagtcaccg 

tgtcccgcgg 

gccgcctgga 

ttcggaactg 

ggtgaaattc 



tgtaagcgtc 
gctggtggtg 
ttcccgtttg 
ccccccttct 
ggtgctgagc 
ggggagc 1 1 1 
ggatgctcag 
cgcccgcgtg 
gccggggcfcc 
gctatcttcc 
ccctggaggg 

cgcggcccgc 

tgtcgfccctg 
agggctccgt 
ggacggggtg 
acccgtgcct 
cgacggcggc 
tcgcctcgcc 
tccttcccct 
ggaegggace 
gggacccact 
tctcfccgtgt 
cggcgctgca 
ttgcctcgcg 
ccccttccgg 
tgtgtcgttc 
cctctgactg 
cgcgtccggg 
egggegggag 
cgcggtcttg 
agaggctagc 

gggggctcgt 

cgaggtgggt 
ggctccggtc 
tcggccctcc 
tccgtccttc 
cctccgcttc 
gtggtcttct 
ccggttcccc 
gcccgcccgt 
gccctggcgt 
gccgggctcc 
ccggtggtgg 
tttgactgtc 
cctgctctcc 
atttgtgccg 
cgacgctcgg 
aeggggtegg 
cccggtttgt 
ctctgtcccg 
gatcctgcca 
cggccggtac 
cgctcctctc 
tgacccccct 
tccctcccgg 
ggccgatcgc 
tcgatggtag 
teeggagagg 
tacccactcc 
ccctgtaatt 
ctggtgccag 
aaaaagctcg 
cccgtccccg 
ggcccgaagc 
taccgcagct 
aggecatgat 
ttggaccggc 



6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 
10080 
10140 
10200 
10260 
10320 
10380 
10440 
10500 
10560 
10620 
gcaagacgga 
tcggaggttc 
cgatgcggcg 
ggttccgggg 
ccaggagtgg 
cggacaggat 
fcfccfctagttg 
ctaactagtt 
gcgttcagcc 
tgcacgcgcg 
aacccgttga 
gaattcccag 
accgcccgtc 
ggtcggccca 
agtaaaagtc 
ctgtggagga 
cgcgtgcgtc 
gaaggggtgg 
tcccctctcc 
gcgfccfcfcgcc 
ggtttttgac 
cccatccccg 
ggatgfcgagt 
gtcctccccg 
coccgggggg 
cggtcgttcg 
cccgaggcgg 
cccgacccgc 
gggttcccgt 
cacgtgtctc 
cctctctctc 
cgtgagttcg 
tgcgtcgatg 
catcgacact 
cgtcggttga 
ctcgcagggc 
gggcggfctgt 
cgcgctcgcg 
gcctcgcgtc 
tgggaaccca 
gaggttggcg 
ggttgtcggg 
gfctfcgggtct 
ggcgccgcgc 
gtatccccgg 
cctcggtggg 
cgtggctctt 
ccgcgggacg 
sggagggaga 
ctgtgggcfcg 
ccctcccgcc 
gccgggtgcc 
tgtcccccct 
attagtcagc 
gaagagccca 
gacccactcc 
tggacggtgt 
gttgcfctggg 
cgagaccgat 
tcaagagggc 
gatfccaaccc 
ccccgttcct 
gcctccggcg 
Sggtcggcgg 
ggcggtgcgc 

gggggggcgg 

ggccgcgctt 



ccagagcgaa 
gaagacgatc 
gcgttattcc 
ggagtatggt 
gcctgcggct 
tgacagattg 
gtggagcgat 
acgcgacccc 
acccgagatt 
ctacacfcgac 
accccattcg 
taagtgcggg 
gctactaccg 
cggccctggc 
gtaacaaggt 
gcggcggcgt 
ccgggtcccg 

gtggggtcgg 
ctcgtccggc 
tctttcccgt 
ccgtcccggg 
ccgcggctct 
gtcgcgtgtg 
ctcctgtccc 
gtcgccctgc 
ggcggctctc 
cggtcgtgtg 
gccgccggct 
gtcgttcccg 
gtttcgttcc 
cggggagagg 
ctcacacccg 
aagaacgcag 
tcgaacgcac 
cgatcaatcg 
caacccccca 
cggtgtggcg 
gcttcttccc 
ggcgcctccc 
ccgcgccccc 
gttgagggtg 
gtggcggtcg 
tgcgctgggg 
accctccggc 
tggcgttgcg 
cgccttcgcg 
cttcgtctcc 
ccgcggcgtc 
gggcctcgct 
tgcgtcccgg 
ggcctctcgg 
gtctctttcc 
ttctgaccgc 
ggaggaaaag 
gcgccgaatc 
ccggcgccgc 
gaggccggta 
aatgcagccc 
agtcaacaag 
gtgaaaccgt 
ggcggcgcgc 
cccgacccct 
gcgggcgcgg 
gggaccgccc 
cgcgaccggc 
cgcgtctcag 
tcgccgaatc 



agcatttgcc 
agataccgtc 
catgacccgc 
tgcaaagctg 
taattfcgact 
atagctcttt 
ttgtctggtt 
cgagcggtcg 
gagcaafcaac 
tggctcagcg 
tgatggggat 
tcataagctt 
atfcggatggt 
ggagcgctga 
ttccgtaggt 
ggcccgctct 
fccgcccgcgfc 
tctgggtccg 
tctgacctcg 
ccggctcttc 
ggcgttcggt 
ggcttttcta 
ggctcgcccg 
gggtacctag 
cgcccccagg 
cctcagactc 

ggggggtgga 

tgcccgattt 
tgtttttccg 
tgctggccgg 
agggcggtgg 
aaataccgat 
ctagctgcga 
ttgcggcccc 
cgtcacccgc 
acccgggtcg 
cgcgcgcccg 
gctccgccgt 
ggaccgctgc 
gtggcgcccg 
tgcgtgcgcc 
acgagggccg 
gaggcggggt 
ttgtgtggag 
agggagggtt 
ccgcacgcgg 
gcttctcctt 
cgtgcgccga 
gacccgttgc 

gggttgcgtg 

ggaccccctg 
cgcccgcctc 
gacctcagat 
aaactaacca 
cccgccgcgc 
tcgtgggggg 
gcggccccgg 
aaagcgggtg 
taccgtaagg 
taagaggtaa 
gtccggccgt 
ccacccgcgc 

ggggtggtgt 

ccggccggcg 
tccgggacgg 
ggcgcgccga 
ccggggccga 



aagaatgttt 
gtagttccga 
cgggcagctt 
aaacttaaag 
caacacggga 
ctcgattccg 
aattccgata 
gcgtccccca 
aggtctgtga 
tgtgcctacc 
cggggattgc 
gcgttgatta 
ttagtgaggc 
gaagacggtc 
gaacctgcgg 
ccccgtcttg 
gtggagcgag 
tctgggaccg 
ccaccctacc 
cgtgtctacg 
cgtcggggcg 
cgfctggctgg 
tcccgatgcc 
ctgtcgcgtt 
gtcggggggc 
catgaccctc 
tgtctggagc 
ccgcgggtcg 
ctcccgaccc 
cctgaggcta 
tcgttggggg 
acgactctta 
gaattaatgt 
gggttcctcc 
tgcggtgggt 
ggccctccgt 

cgbcgcggag 

tcccgccctc 
ctcaccagtc 
ggggtgggcg 
gaggtggtgg 
gtcggtcgcc 
cgaccgctcg 
ggagagcgag 
fcggcgfccccg 
ccgctagggg 
cacccgggcg 
tgcgagtcac 
gfccccggctt 
tgagtaagat 
agacggttcg 
ctcgctctct 
cagacgtggc 
ggattccctc 
gtcgcggcgt 
cccaagtcct 
cgcgccgggc 
gtaaactcca 
gaaagttgaa 
acgggtgggg 
gcccggtggt 
gtcgttcccc 
ggtggtggcg 
accggccgcc 
ccgggaaggc 
accacctcac 
ggaagccaga 



tcattaatca 
ccataaacga 
ccgggaaacc 
gaafctgacgg 
aacctcaccc 
tgggtggtgg 
acgaacgaga 
acttcttaga 
tgcccttaga 
ctgcgccggc 
aattattccc 
agtccctgcc 
cctcggatcg 
gaacttgact 
aaggatcatt 
tgtgtgtcct 
gtgtctggag 
cctccgattt 
gcggcggcgg 
aggggcggta 
cgcgctttgc 

ggcggttgtc 

acgcttttct 
ccggcgcgga 
ggtggggccc 
ctccccccgc 
cccctcgggc 
gtcctgtcgg 
tttttttttc 
cccctcggtc 
actgtgccgt 
gcggtggatc 
gaattgcagg 

cggggctacg 

gctgcgcggc 
ctcccgaagt 
cctggtctcc 
gcccgtgcac 
tttctcggtc 
cgtccgcatc 
tcggtcccct 
tgcggtggfct 
cggggttggc 
ggcgagaacg 
cgtccgfcccg 
cggtcggggc 
gtacccgctc 
ccccgggtgt 
ccctgggggg 
cctccacccc 
ccggctcgtc 
tcttcccgcg 
gacccgctga 
agtaacggcg 
gggaaafcgtg 
tctgatcgag 
tcgggtcttc 
tctaaggcta 
aagaactttg 
tccgcgcagt 
cccggcggat 
tcttcctccc 
cgcgggcggg 
gccgggcgca 
ccggfcgggga 
cccgagtgtt 
tacccgtcgc 



agaacgaaag 
tgccgactgg 
aaagtctttg 
aagggcacca 
ggcccggaca 
tgcatggccg 
ctctggcatg 
gggacaagtg 
tgtccggggc 
aggcgcgggt 
catgaacgag 
ctttgtacac 
gccccgccgg 
atctagagga 
aaacgggaga 
cgccgggagg 
tgaggtgaga 
cccctccccc 
ctgctcgcgg 
cgtcgttacg 
tctcccggca 
gcgtgtgggg 
ggcctcgcgt 
ggtttaagga 
gtagggaagt 
tgccgccgtt 
gccgfcggggg 
tgccggtcgt 
ctccccccca 
catcfcgttct 
cgtcagcacc 
actcggctcg 
acacattgat 
cctgtctgag 

tgggagtttg 

tcagacgtgt 
cccgcgcatc 
cccggtcctg 
ccgtgccccg 
tgctctggtc 
gcggccgcgg 
gtctgtgtgt 
gcggtcgccc 
gagagaggtg 
tccctccctc 
ccgtggcccc 

cggcgccggc 

tgcgagttcg 
gacccggcgt 
cgccgccctc 
ctcccgtgcc 
gctgggcgcg 
atttaagcat 
agtgaacagg 
gcgtacggaa 
gcccagcccg 
ceggagtegg 
aataceggea 
aagagagagt 
ccgcccggag 
ctttcccgct 
cgcgtccggc 
geegggggtg 
cttccaccgt 
aggtggctcg 
acagccctcc 
cgcgctctcc 



10680 
10740 
10800 
10860 
10920 
10980 
11040 
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520 
11580 
11640 
11700 
11760 
11820 
11880 
11940 
12000 
12060 
12120 
12180 
12240 
12300 
12360 
12420 
12480 
12540 
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
13140 
13200 
13260 
13320 
13380 
13440 
13500 
13560 
13620 
13680 
13740 
13800 
13860 
13920 
13980 
14040 
14100 
14160 
14220 
14280 
14340 
14400 
14460 
14520 
14580 
14640 






ctctcccccc 
ggcgcgaccg 
cggactgtcc 
gtcacgcgtc 
cgacccgtct 
gaaagccgcc 
cgaggcctct 
aggtggagca 
cgaagccaga 
cgacctgggt 
tttccctcag 
aatgattaga 
agaagcccgg 
ttggtaagca 
gacgctcatc 
gaagtcggaa 
aatggatggc 
ggacgggagc 
cccccgcctc 
cgccgcgacg 
tggagccgcc 
aggccgaagt 
agagatgggc 
cgaaagggag 
cagtgcggta 
tttctttgtg 
ttggaaagcg 
ggggagaggg 
gaacagcctc 
acttcgggat 
ctgggcgcgc 
cccgtccttt 
cgtcgtcgcc 
cgcgcggcgc 
accagcggtc 
ctctggacgc 
gctcccgggg 
ccccccatcg 

gggggaacct 

cgcggcgccc 
gaatccgact 
tgtgatttct 
acggcgggag 
cgcgcatgaa 
gccaagggaa 
agtctggcac 
gccccgtcct 
ctactctcat 
cgcttctggc 
agtgccaggt 
taaggcgagc 
atcttgattt 
tttgggtttt 
ccaagcgttc 
gaagcagaat 
tagaccgtcg 
cctgctcagt 
gccaatgggg 
gcccaagcgg 
ccccgtccgt 
ccgccgggcg 
cggccggaaa 
ggcgctaaac 
gctccctcgc 
tttcccgtcg 
gcggtcgcct 
cgtcttctcc 



gtccgcctcc 
ctctcccacc 
ccagtgcgcc 
tcccgacgaa 
tgaaacacgg 
gtggcgcaat 
ccagtccgcc 
cgagcgtacg 
ggaaactctg 
ataggggcga 
gatagctggc 
ggtcttgggg 
ctcgctggcg 
gaactggcgc 
agaccccaga 
tccgctaagg 
gctggagcgt 
ggccgcgggt 
ccctccgcgc 
agtaggaggg 
gcaggtgcag 
ggagaagggt 
gagtgccgtt 
tcgggttcag 
acgcgaccga 
aagggcaggg 
tcgcggttcc 
tgtaaatctc 
tggcatgttg 
aaggattggc 
gccgcggctg 
ccgcccgggc 
acctctcttc 
gggctccggg 
cccggtgggg 
gagccgggcc 
agcccggcgg 
cctctcccga 
ccgcgtcggt 
ccgcctcggc 
gtttaattaa 
gcccagtgct 
taactatgac 
tggatgaacg 
egggcttgge 
ggtgaagaga 
cgcgtcgggg 
cgttttttca 
gccaagcgtc 
ggggagtttg 
tcagggagga 
tcagtacgaa 
aagcaggagg 
atagegaegt 
tcaccaagcg 
tgagacaggt 
acgagaggaa 
cgaagctacc 
aacgatacgg 
cccgctcggc 
tegggacegg 
gggggccgcc 
cattegtaga 
tgegatctat 
cacgcccgct 
cggcccccgc 
tccgtctccc 



egggegggeg 
cccctccgtc 
ccgggcgtcg 
gccgagcgca 
accaaggagt 
gaaggtgaag 
gagggegcac 
cgttaggacc 
gtggaggtcc 
aagactaatc 
gctctcgctc 
ccgaaacgat 
tggagccggg 
tgcgggatga 
aaaggtgttg 
agtgtgtaac 
cgggcccata 
gcgcgtctct 
geegggtteg 
ccgctgcggt 
atcttggtgg 
tccatgtgaa 
ccgaagggac 
atccccgaat 
teceggagaa 
cgccctggaa 
ggcggcgtcc 
gcgccgggcc 
gaacaatgta 
tctaagggct 
gaegaggege 
ccgccctccc 
ccccctcctt 
gcggcgggtc 
cggggggccc 
cttcccgtgg 
gtgccggcgc 
ggtgcgtggc 
gttcccccgc 
cggcgcctag 
aacaaagcat 
ctgaatgtca 
tctcttaagg 
agattcccac 
ggaatcagcg 
catgagaggt 
teggggcacg 
ctgacccggt 
cgtcccgcgc 
actggggegg 
cagaaacctc 
tacagaccgt 
tgtcagaaaa 
cgctttttga 
ttggattgtt 
tagttttacc 
ccgcaggttc 
atctgtggga 
cagcgccgaa 
ggggtccccg 
ggtccggtgc 
ctctcgcccg 
cgacctgctt 
tgaaagtcag 
cgctcgcacg 
gcggttgccc 
gaggaeggtt 



tgggggtggg 

gcctctctcg 
tcgcgccgtc 
eggggtegge 
etaacgegtg 
ggccccgccc 
caccggcccg 
cgaaagatgg 
gtageggtec 
gaaccatcta 
ccgacgtacg 
ctcaacctat 
cgtggaatgc 
accgaacgcc 
gttgatatag 
aactcacctg 
cccggccgtc 

eggggteggg 

cccccgcggc 
gagecttgaa 
tagtagcaaa 
cagcagttga 
gggcgatggc 
ccggagtggc 
geeggeggga 
tgggttcgcc 
ggtgagctct 
gtacccatat 
ggtaagggaa 
gggteggteg 
cgccgccctc 
ctcttccccg 
cttcccgtcg 
caaccccgcg 
ggacactegg 
atcgcctcag 
gggtcccctc 

gggggeggge 

cgggtccgcc 
cagccgactt 
cgcgaaggcc 
aagtgaagaa 
tagecaaatg 
tgtccctacc 
gggaaagaag 
gtagaataag 
ccggcctcgc 
gaggeggggg 
gtgegggegg 
tacacctgtc 
ccgtggagca 
gaaagcgggg 
gttaccacag 
tccttcgatg 
cacccactaa 
ctactgatga 
agacatttgg 
ttatgactga 
ggagectegg 
cgtcgccccg 
ggagagccgt 
tcacgttgaa 
ctgggtcggg 
ccctcgacac 
cgaccgtgtc 
gaacgaccgt 
cgtttctctt 



ggccgggccg 
gggcccggtg 
gggtcccggg 
ggcgafcgtcg 
cgcgagtcag 
gggggecega 
tctcgcccgc 
tgaactatgc 
tgacgtgcaa 
gtagctggtt 
cagttttatc 
tctcaaactt 
gagtgectag 
gggttaaggc 
acagcaggac 
ccgaatcaac 
gccgcagtcg 
ggtgcgtggc 
gtcgggcccc 
gectagggeg 
tattcaaacg 
acatgggtca 
ctccgttgcc 
ggagatgggc 
ggcctcgggg 
ccgagagagg 
cgctggccct 
ccgcagcagg 
gteggcaage 
ggctggggcg 
tcccacgtcc 
cggggccccg 
gggggcgggt 
ggggttccgg 

ggggccggcg 
ctgcggcggg 
cccgcggggc 
gggcgtgtcc 
ccccgggccg 
agaactggtg 
cgcggcgggt 
attcaatgaa 
cctcgtcatc 
tactatccag 
accctgttga 
tgggaggece 
gggccgccgg 
ggcgagcccc 
gcgcgacccg 
aaaeggtaac 
gaagggcaaa 
cctcacgatc 
ggataactgg 
tcggctcttc 
tagggaacgt 
tgtgttgttg 
tgtatgtgct 
acgcctctaa 
ttggccccgg 
cggcggcgcg 
tcgtcttggg 
cgcacgttcg 
gtttcgtacg 
aagggtttgt 
gccgcccggg 
gtggtggttg 
tccccttccg 



cccctcccac 

gggggcgggg 

gggacegteg 

gctacccacc 

gggctcgtcc 

ggtgggatcc 

cgcgccgggg 

ttgggcaggg 

ateggtegtc 

ccctccgaag 

eggtaaageg 

taaatgggta 

tgggecaett 

gcccgatgcc 

ggtggccatg 

tagecctgaa 

gaacggaacg 

gggggcccgt 

geggagecta 

cgggcccggg 

agaactttga 

gtcggtcctg 

ctcggccgat 

gccgcgaggc 

agagttctct 

ggcccgtgcc 

tgaaaatccg 

tctccaaggt 

eggatcegta 

egaagegggg 

ggggagaccc 

tcgtcccccg 

egggggtegg 

agegggagga 

geggeggega 

cgtcgcggcc 

ctcgctccac 

cgcgcgtgtg 

cggttttccg 

eggaccaggg 

gttgacgega 

gcgcgggtaa 

taattagtga 

cgaaaccaca 

gcttgactct 

ccggcgcccg 

tgaaatacca 

gaggggctct 

ctccggggac 

gcaggtgtcc 

agetegcttg 

cttctgacct 

cttgtggcgg 

ctatcattgt 

gagctgggtt 

ccatggtaat 

tggctgagga 

gtcagaatcc 

atagcegggt 

gggtctcccc 

aaacggggtg 

tgtggaacct 

tagcagagca 

ctctgcgggc 

cgtcacgggg 

ggggggggat 

tcgctctcct 



14700 
14760 
14820 
14880 
14940 
15000 
15060 
15120 
15180 
15240 
15300 
15360 
15420 
15480 
15540 
15600 
15660 
15720 
15780 
15840 
15900 
15960 
16020 
16080 
16140 
16200 
16260 
16320 
16380 
16440 
16500 
16560 
16620 
16680 
16740 
16800 
16860 
16920 
16980 
17040 
17100 
17160 
17220 
17280 
17340 
17400 
17460 
17520 
17580 
17640 
17700 
17760 
17820 
17880 
17940 
18000 
18060 
18120 
18180 
18240 
18300 
18360 
18420 
18480 
18540 
18600 
18660 
tgggtgtggg 
cttgccctcc 
ccgcggcgcg 
gttggagggg 
gccggggggg 
gcggcggcga 
gcggcgcctc 
actttgtttt 
cccccccccc 

tttttttttt 

catataggtc 
gaccagatat 
tttttttttt 
atatacttat 
tccaccgatg 
ccgcgacgcg 
tggaacctta 
tactttgtct 
ataaattatc 
tttgtgttgt 
tttgtgttgt 
gttgggttgg 
ttgtttgctg 
tacacaaaca 
tatccctttc 
tgtgtgtgtg 
tacttataat 
gcagacttct 
aataaataca 
gttgaccagt 
aatagataga 
attaaccact 
atttgaactc 
tggtctgtct 
tgcttttttt 
tctgtagacc 
caattttgga 
attttattat 
attagttgga 
ttgtggggct 
gatttttgta 
tttcattgct 
ccagttcctc 
gatgtgctag 
aaaagttcta 
tgttctcact 
agacatatat 
ttcccagacg 
caccacaact 
ggaaaagcat 
ggttgtgaac 
agtcagggct 
tgaatgatcc 
ataaaataat 
ctcacagcac 
gaggggtggg 
atggcctggt 
tacctgaagt 



agcctcgtgc 
ggccttggcc 
gtgacgcacg 
cgggaggggt 
cgctctctcc 
cgtgcgtacg 
ttccattttt 
ttttttttcc 
ccccccggcg 
ttaaattcct 
gaccagtact 
ccgaaagtcc 
tttggtgtgc 
aggaggaggt 
atggaggtcg 
gcgggctcac 
aggtcgacca 
ttttctgaaa 
tgatctagat 
tttgttttgt 
gttgtgttgt 
gttgggttgt 
ttgttttgtg 
tgcacttttt 
cttctctctc 
tgcgtgtgtg 
aataggtcgc 
gagttcgagg 
tacatacata 
tgtcaatcct 
tggatagagt 
tttccctttt 
aggaccctgg 
gctgtttgtt 
tttcttctga 
agcctggcct 
gtaaaggtgt 
tagacagaac 
ccaattagtt 
ggggatcagg 
aagattactt 
tcatttctat 
ctgccttctg 
tgaaccagag 
acaaagtgat 
ctgccaccaa 
tttttctttt 
gccttttgag 
ctaacctgtt 
gtagcagttg 
cacccaccat 
ctaaaccgat 
cagcatggga 
gaaatgaatg 
ctccccctcc 
gtgggggcag 
tctctgaact 
ccctgagtga 



cgtcgcgacc 
aagccggagg 
gtgggatccc 
ttttcccgtg 
gcccgagcat 
aggggaggat 
tcccccccaa 
cccgatgctg 
cggagcggcg 
ggaaccttta 
ccgggtggta 
tctctttccc 
ctctttttga 
cgaccagtac 
accagatgtc 
tctggactct 
gttgtccgtc 
atcgcagagg 
ttgtttttct 
tttgttttgt 
gttgtgttgg 
gttgttfcggt 
ttttgcgggt 
ttaaaataaa 
ttttttaaaa 
tgtgtgtgtg 
cgggtggtgg 
ccagcctggt 
catacataca 
ttagaatttt 
gatacaaata 
taggtttttt 
caggtcaact 
tgtttgcttg 
gacagtattt 
caatcgaact 
gctacaccac 
gaaatcaact 
ggctggtttg 
tatctcaacg 
ttcttagtct 
ttctctttct 
gaagatgtag 
agtttggatg 
ctttaacttt 
cgcgctttgt 
ggttttgctt 
aataaaatgg 
tggctgtttt 
taggacacac 
gtggttgcct 
gagccatctc 
agacagtctg 
aagtctccac 
cccacactgc 
ggatctgcat 
gttgagcctt 
tgatttccct 



gcggcctgcc 
gcggaggagg 
catcctcggc 
aacgccgcgt 
ccccactccc 
gtcgcggtgt 
cttcggaggt 
gaggtcgacc 
gggccactct 
ggtcgaccag 
ctttgtcttt 
tttactcttc 
cttatataca 
tccgggcgac 
cgaaagtgtc 
tttttttttt 
tttcactcat 
tcgaccagat 
gtttttcagt 
tttgttttgt 
gttgggttgg 
tttgtgttgt 
cgaacagttg 
tttttaaaat 
attttctttg 
cgtgcagcgt 
tagcttcccg 
ctacagagga 
tacatacata 
gtttttaatt 
taggtttttt 
tttttttccc 
ggaaaacgtg 
cttgcttgct 
ctctgtgtaa 
cagaaatcct 
tgcctggcat 
agttggtcct 
ggaggtttct 
gaatgcatga 
gaggaaaaaa 
ttctttcttt 
gcattgcatt 
tcaagccgta 
tttttttttt 
acattgaatg 
gacatggttt 
gaggccagaa 
ccttcccaag 
tagacgagag 
gggatttgaa 
tccagccctc 
ccctctttgt 
gtatttattt 
ctttctccct 
gtcttcttgc 
gtctatccag 
gtgaattc 



gtcgcctgcc 
gggatcggcg 
gcgtccgtcg 
tcggcgccag 
gcccctcctc 
ggaggcggag 
cgaccagtac 
agatgtccga 
ggactctttt 
ttgtccgtct 
ttctgaaaat 
cccacagcga 
tgtaaatagt 
actttgtttt 
ccgtcccccc 
tttttttttt 
tcatataggt 
gtcagaaagt 
tttgtgttgt 
tttgttttgt 
gttgggttgg 
ttggtgttgt 
tccctaaccg 
aaatgcgaaa 
tgtgtgtgtg 
gcgcgcgctc 
gactccagag 
accctgtctc 
catacataca 
aatgtgatag 
tttcagtaaa 
ctgtccatgt 
ttttctatat 
tgcttgcttg 
cctggtgccc 
cctgcctctt 
tattatcatt 
gtttcgttaa 
tttgtttccg 
aggttaaggt 
taaaataata 
ctttcagata 
gggaaaagca 
taatgtttat 
tttctccttc 
tgagctttgt 
ccctttctat 
ccaaagtctt 
gcacagatct 
caccagatct 
ctcaggatct 
ctacattcct 
ggtatatcac 
cttcgagcta 
atgtttgggt 
aggtctgtga 
aggctgactg 



gccgcagccc 
gcggcggcga 
gggacggccg 
gcctctggcg 
ttcgcgcgcc 
agggtccggc 
tccgggcgac 
aagtgtcccc 
tttttttttt 
tttactcctt 
cccagaggtc 
ttctcttttt 
gtgtacgttt 
tttttttttt 
cctccccccc 
tttaaatttc 
cgaccggtgg 
ctggtggtcg 
tttgtgttgt 
tttgttttgt 
gttgggttgg 
tggttttgtt 
agtttttttg 
atcgaccaat 
tgtgtgtgtg 
gttttataaa 
gcagaggcag 
gaaaaatgaa 
tacatatgag 
agagatagat 
tatgaggttg 
ggttgctggg 
atataaatag 
cttgcttgct 
tgaaactcac 
gtctacctcc 
atcattatta 
ttcatttgaa 
atttgggtgt 
gagatggctc 
ttgggctacg 
aggaggtcgg 
ttgtttgaga 
tacaatatag 
tacttctact 
tttgcttaac 
ccgtgcaggg 
ttgaataaag 
ttcccagcat 
cattgtgggt 
tcagaagacg 
tcttaaggca 
catatactca 
tctaaattct 
ggggctgggg 
actatttgcg 
gctagttttc 



18720 
18780 
18840 
18900 
18960 
19020 
19080 
19140 
19200 
19260 
19320 
19380 
19440 
19500 
19560 
19620 
19680 
19740 
19800 
19860 
19920 
19980 
20040 
20100 
20160 
20220 
20280 
20340 
20400 
20460 
20520 
20580 
20640 
20700 
20760 
20820 
20880 
20940 
21000 
21060 
21120 
21180 
21240 
21300 
21360 
21420 
21480 
21540 
21600 
21660 
21720 
21780 
21840 
21900 
21960 
22020 
22080 
22118 



<210> 19 
<211> 175 
<212> DNA 

<213> Mus musculus 



<400> 19 

ctcccgcgcg gcccccgtgt tcgccgttcc cgtggcgcgg acaatgcggt tgtgcgtcca 60 
cgtgtgcgtg tccgtgcagt gccgttgtgg agtgcctcgc tctcctcctc ctccccggca 120 
gcgttcccac ggttggggac caccggtgac ctcgccctct tcgggcctgg atccg 175 

<210> 20 

<211> 755 

<212> DNA 

<213> Mus musculus 

<400> 20 

ggtctggtgg gaattgttga cctcgctctc gggtgcggcc tttggggaac ggcggggtcg 60 

gtcgtgcccg gcgccggacg tgtgtcgggg cccacttccc gctcgagggt ggcggtggcg 120 

gcggcgttgg tagtctcccg tgttgcgtct tcccgggctc ttgggggggg tgccgtcgtt 180 

ttcggggccg gcgttgcttg gcttacgcag gcttggtttg ggactgcctc aggagtcgtg 240 

ggcggtgtga ttcccgccgg ttttgcctcg cgtctgcctg ctttgcctcg ggtttgcttg 300 

gttcgtgtct cgggagcggt ggtttttttt tttttcgggt cccggggaga ggggtttttc 360 

cgggggacgt tcccgtcgcc ccctgccgcc ggtgggtttt cgtttcgggc tgtgttcgtt 420 

tccccttccc cgtttcgccg tcggttctcc ccggtcggtc ggccctctcc ccggtcggtc 480 

gcccggccgt gctgccggac ccccccttct ggggggg^tg cccgggcacg cacgcgtccg 540 

ggcggccact gtggtccggg agctgctcgg caggcgggtg agccagttgg aggggcgtca 600 

tgcccccgcg ggctcccgtg gccgacgcgg cgtgttcttt gggggggcct gtgcgtgcgg 660 

gaaggctgcg cacgttgtcg gtccttgcga gggaaagagg cttttttttt ttagggggtc 720 

gtccttcgtc gtcccgtcgg cggtggatcc ggcct 755 

<210> 21 

<211> 463 

<212> DNA 

<213> Mus musculus 

<400> 21 

ggccgaggtg cgtctgcggg ttggggctcg tccggccccg tcgtcctccg ggaaggcgtt 60 
tagcgggtac cgtcgccgcg ccgaggtggg cgcacgtcgg tgagataacc ccgagcgtgt 120 
ttctggttgt tggcggcggg ggctccggtc gatgtcttcc cctccccctc tccccgaggc 180 
caggtcagcc tccgcctgtg ggcttcgtcg gccgtctccc cccccctcac gtccctcgcg 240 
agcgagcccg tccgttcgac cttccttccg ccttcccccc atctttccgc gctccgttgg 300 
ccccggggtt ttcacggcgc cccccacgct cctccgcctc tccgcccgtg gtttggacgc 360 
ctggttccgg tctccccgcc aaaccccggt tgggttggtc tccggccccg gcttgctctt 420 
cgggtctccc aacccccggc cggaagggtt cgggggttcc ggg 463 

<210> 22 

<211> 378 

<212> DNA 

<213> Mus musculus 

<400> 22 

ggattcttca ggattgaaac ccaaaccggt tcagtttcct ttccggctcc ggccgggggg 60 
ggcggccccg ggcggtttgg tgagttagat aacctcgggc cgatcgcacg ccccccgtgg 120 
cggcgacgac ccattcgaac gtctgcccta tcaactttcg atggtagtcg atgtgcctac 180 
catggtgacc acgggtgacg gggaatcagg gttcgattcc ggagagggag cctgagaaac 240 
ggctaccaca tccaaggaag gcagcaggcg cgcaaattac ccactcccga cccggggagg 300 
tagtgacgaa aaataacaat acaggactct ttcgaggccc tgtaattgga atgagtccac 360 
tttaaatcct ttaagcag 378 

<210> 23 

<211> 378 

<212> DNA 

<213> Mus musculus 

<400> 23 

gatccattgg agggcaagtc tggtgccagc agccgcggta attccagctc caatagcgta 60 
tattaaagtt gctgcagtta aaaagctcgt agttggatct tgggagcggg cgggcggtcc 120 
gccgcgaggc gagtcaccgc ccgtccccgc cccttgcctc tcggcgcccc ctcgatgctc 180 
ttagctgagt tgtcccgcgg ggcccgaagc gtttactttg aaaaaattag agttgtttca 240 
aagcaggccc gagccgcctg gataccgcca gctaggaaat aatggaatag gaccgcggtt 300 
cctattttgt ttggttttcg gaactgagcc catgattaag ggaaacggcc gggggcattc 360 
ccttattgcg ccccccta 378 



<210> 24 
<211> 719 
<212> DNA 

<213> Mus musculus 



<400> 24 

ggatctttcc cgctccccgt tcctcccggc ccctccaccc gcgcgtctcc ccccttcttt 60 
tcccctctcc ggaggggggg gaggtggggg cgcgtgggcg gggtcggggg tggggtcggc 120 
gggggaccgc ccccggccgg caaaaggccg ccgccgggcg cacttcaacc gtagcggtgc 180 
gccgcgaccg gctacgagac ggctgggaag gcccgacggg gaatgtggct cggggggggc 240 
ggcgcgtctc agggcgcgcc gaaccacctc accccgagtg ttacagccct ccggccgcgc 300 
tttcgcggaa tcccggggcc gaggggaagc ccgatacccg tcgccgcgct tttcccctcc 360 
ccccgtccgc ctcccgggcg ggcgtggggg tgggggccgg gccgcccctc ccacgcccgt 420 
ggtttctctc tctcccggtc tcggccggtt tggggggggg agcccggttg ggggcggggc 480 
ggactgtcct cagtgcgccc cgggcgtcgt cgcgccgtcg ggcccggggg gttctctcgg 540 
tcacgccgcc cccgacgaag ccgagcgcac ggggtcggcg gcgatgtcgg ctacccaccc 600 
gacccgtctt gaaacacgga ccaaggagtc taacgcgtgc gcgagtcagg ggctcgcacg 660 
aaagccgccg tggcgcaatg aaggtgaagg gccccgtccg ggggcccgag gtgggatcc 719 

<210> 25 

<211> 685 

<212> DNA 

<213> Mus musculus 

<400> 25 

cgaggcctct ccagtccgcc gagggcgcac caccggcccg tctcgcccgc cgcgtcgggg 60 
aggtggagca cgagcgtacg cgttaggacc cgaaagatgg tgaactatgc ctgggcaggg 120 
cgaagccaga ggaaactctg gtggaggtcc gtagcggtcc tgacgtgcaa atcggtcgtc 180 
cgacctgggt ataggggcga aagactaatc gaaccatcta gtagctggtt ccctccgaag 240 
tttccctcag gatagctggc gctctcgcaa ccttcggaag cagttttatc cgggtaaagg 300 
cggaatggat taggaggtct tggggccgga aacgatctca aactatttct caaactttaa 360 
atgggtaagg aagcccggct cgctggcgtg gagccgggcg tggaatgcga gtgcctagtg 420 
ggccactttt ggtaagcaga actggcgctg cgggatgaac cgaacgccgg gttaaggcgc 480 
ccgatgccga cgctcatcag accccagaaa aggtgttggt tgatatagac agcaggacgg 540 
tggccatgga agtcggaatc cgctaaggag tgtgtaacaa ctcacctgcc gaatcaacta 600 
gccctgaaaa tggatggcgc tggagcgtcg ggcccatacc cggccgtcgc cggcagtcgg 660 
aacgggacgg gacgggagcg gccgc 685 

<210> 26 

<211> 5162 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Chimeric bacterial plasmid 

<400> 26 

gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60 

ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120 

cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180 

ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240 

gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300 

tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360 

cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420 

attgacgtca atgggtggac tatttacggt aaactgccca cttggcagta catcaagtgt 480 

atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540 

atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600 

tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660 

actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720 

aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780 

gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840 

ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gcttggtacc 900 

gagctcggat cgatatctgc ggccgcgtcg acggaattca gtggatccac tagtaacggc 960 

cgccagtgtg ctggaattaa ttcgctgtct gcgagggcca gctgttgggg tgagtactcc 1020 

ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa cgaggaggat 1080 

ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat ctggtcagaa 1140 

aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg gccatacact 1200 

tgagtgacaa tgacatccac tttgcctttc tctccacagg tgtccactcc caggtccaac 1260 

tgcaggtcga gcatgcatct agggcggcca attccgcccc tctccctccc ccccccctaa 1320 

cgttactggc cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgtgattttc 1380 

caccatattg ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac 1440 

gagcattcct aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt 1500 

gaaggaagca gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg 1560 

caggcagcgg aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata 1620 

agatacacct gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga 1680 

aagagtcaaa tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt 1740 

accccattgt atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc 1800 

gaggttaaaa aaacgtctag gccccccgaa ccacggggac gtggttttcc tttgaaaaac 1860 

acgatgataa gcttgccaca acccgggatc caccggtcgc caccatggtg agcaagggcg 1920 

aggagctgtt caccggggtg gtgcccatcc tggtcgagct ggacggcgac gtaaacggcc 1980 

acaagttcag cgtgtccggc gagggcgagg gcgatgccac ctacggcaag ctgaccctga 2040 

agttcatctg caccaccggc aagctgcccg tgccctggcc caccctcgtg accaccctga 2100 

cctacggcgt gcagtgcttc agccgctacc ccgaccacat gaagcagcac gacttcttca 2160 

agtccgccat gcccgaaggc tacgtccagg agcgcaccat cttcttcaag gacgacggca 2220 

actacaagac ccgcgccgag gtgaagttcg agggcgacac cctggtgaac cgcatcgagc 2280 

tgaagggcat cgacttcaag gaggacggca acatcctggg gcacaagctg gagtacaact 2340 

acaacagcca caacgtctat atcatggccg acaagcagaa gaacggcatc aaggtgaact 2400 

tcaagatccg ccacaacatc gaggacggca gcgtgcagct cgccgaccac taccagcaga 2460 

acacccccat cggcgacggc cccgtgctgc tgcccgacaa ccactacctg agcacccagt 2520 

ccgccctgag caaagacccc aacgagaagc gcgatcacat ggtcctgctg gagttcgtga 2580 

ccgccgccgg gatcactctc ggcatggacg agctgtacaa gtaaagcggc cctagagctc 2640 

gctgatcagc ctcgactgtg cctctagttg ccagccatct gttgtttgcc cctcccccgt 2700 

gccttccttg accctggaag gtgccactcc cactgtcctt tcctaataaa atgaggaaat 2760 

tgcatcgcat tgtctgagta ggtgtcattc tattctgggg ggtggggtgg ggcaggacag 2820 

caagggggag gattgggaag acaatagcag gcatgctggg gatgcggtgg gctctatggc 2880 

ttctgaggcg gaaagaacca gctggggctc gagtgcattc tagttgtggt ttgtccaaac 2940 

tcatcaatgt atcttatcat gtctgtatac cgtcgacctc tagctagagc ttggcgtaat 3000 

catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca cacaacatac 3060 

gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa ctcacattaa 3120 

ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat 3180 

gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc 3240 

tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg 3300 

cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg tgagcaaaag 3360 

gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc 3420 

gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag 3480 

gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga 3540 

ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc 3600 

aatgctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg 3660 

tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt 3720 

ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca 3780 

gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca 3840 

ctagaaggac agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag 3900 

ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca 3960 

agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg 4020 

ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa 4080 

aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta 4140 

tatatgagta aacttggtct gacagttacc aatgcttaat cagtgaggca cctatctcag 4200 

cgatctgtct atttcgttca tccatagttg cctgactccc cgtcgtgtag ataactacga 4260 

tacgggaggg cttaccatct ggccccagtg ctgcaatgat accgcgagac ccacgctcac 4320 

cggctccaga tttatcagca ataaaccagc cagccggaag ggccgagcgc agaagtggtc 4380 

ctgcaacttt atccgcctcc atccagtcta ttaattgttg ccgggaagct agagtaagta 4440 

gttcgccagt taatagtttg cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac 4500 

gctcgtcgtt tggtatggct tcattcagct ccggttccca acgatcaagg cgagttacat 4560 

gatcccccat gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc gttgtcagaa 4620 

gtaagttggc cgcagtgtta tcactcatgg ttatggcagc actgcataat tctcttactg 4680 

tcatgccatc cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag tcattctgag 4740 

aatagtgtat gcggcgaccg agttgctctt gcccggcgtc aatacgggat aataccgcgc 4800 

cacatagcag aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct 4860 

caaggatctt accgctgttg agatccagtt cgatgtaacc cactcgtgca cccaactgat 4920 

cttcagcatc ttttactttc accagcgttt ctgggtgagc aaaaacagga aggcaaaatg 4980 

ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat actcatactc ttcctttttc 5040 

aatattattg aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta 5100 

tttagaaaaa taaacaaata ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg 5160 

tc 5162 



<210> 27 
<211> 5627 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pMG plasmid from InvivoGen; IRES sequence modified 
EMCV nucleotides 2736-3308 



<400> 27 

caccggcgaa ggaggcctag atctatcgat tgtacagcta gctcgacatg ataagataca 60 

ttgatgagtt tggacaaacc acaactagaa tgcagtgaaa aaaatgcttt atttgtgaaa 120 

tttgtgatgc tattgcttta tttgtgaaat ttgtgatgct attgctttat ttgtaaccat 180 

tataagctgc aataaacaag ttaacaacaa caattgcatt cattttatgt ttcaggttca 240 

gggggaggtg tgggaggttt tttaaagcaa gtaaaacctc tacaaatgtg gtagatccat 300 

ttaaatgtta attaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 360 

ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 420 

acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 480 

tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 540 

ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 600 

ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 660 

ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 720 

actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 780 

gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg gtatctgcgc 840 

tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 900 

caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 960 

atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 1020 

acgttaaggg attttggtca tggctagtta attaagctgc aataaacaat cattattttc 1080 

attggatctg tgtgttggtt ttttgtgtgg gcttggggga gggggaggcc agaatgactc 1140 

caagagctac aggaaggcag gtcagagacc ccactggaca aacagtggct ggactctgca 1200 

ccataacaca caatcaacag gggagtgagc tggatcgagc tagagtccgt tacataactt 1260 

acggtaaatg gcccgcctgg ctgaccgccc aacgaccccc gcccattgac gtcaataatg 1320 

acgtatgttc ccatagtaac gccaataggg actttccatt gacgtcaatg ggtggagtat 1380 

ttacggtaaa ctgcccactt ggcagtacat caagtgtatc atatgccaag tacgccccct 1440 

attgacgtca atgacggtaa atggcccgcc tggcattatg cccagtacat gaccttatgg 1500 

gactttccta cttggcagta catctacgta ttagtcatcg ctattaccat ggtgatgcgg 1560 

ttttggcagt acatcaatgg gcgtggatag cggtttgact cacggggatt tccaagtctc 1620 

caccccattg acgtcaatgg gagtttgttt tggcaccaaa atcaacggga ctttccaaaa 1680 

tgtcgtaaca actccgcccc attgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 1740 

tatataagca gagctcgttt agtgaaccgt cagatcgcct ggagacgcca tccacgctgt 1800 

tttgacctcc atagaagaca ccgggaccga tccagcctcc gcggccggga acggtgcatt 1860 

ggaacgcgga ttccccgtgc caagagtgac gtaagtaccg cctatagagt ctataggccc 1920 

acccccttgg cttcttatgc atgctatact gtttttggct tggggtctat acacccccgc 1980 

ttcctcatgt tataggtgat ggtatagctt agcctatagg tgtgggttat tgaccattat 2040 

tgaccactcc cctattggtg acgatacttt ccattactaa tccataacat ggctctttgc 2100 

cacaactctc tttattggct atatgccaat acactgtcct tcagagactg acacggactc 2160 

tgtattttta caggatgggg tctcatttat tatttacaaa ttcacatata caacaccacc 2220 

gtccccagtg cccgcagttt ttattaaaca taacgtggga tctccacgcg aatctcgggt 2280 

acgtgttccg gacatgggct cttctccggt agcggcggag cttctacatc cgagccctgc 2340 

tcccatgcct ccagcgactc atggtcgctc ggcagctcct tgctcctaac agtggaggcc 2400 

agacttaggc acagcacgat gcccaccacc accagtgtgc cgcacaaggc cgtggcggta 2460 

gggtatgtgt ctgaaaatga gctcggggag cgggcttgca ccgctgacgc atttggaaga 2520 

cttaaggcag cggcagaaga agatgcaggc agctgagttg ttgtgttctg ataagagtca 2580 

gaggtaactc ccgttgcggt gctgttaacg gtggagggca gtgtagtctg agcagtactc 2640 

gttgctgccg cgcgcgccac cagacataat agctgacaga ctaacagact gttcctttcc 2700 

atgggtcttt tctgcagtca cccgggggat ccttcgaacg tagctctaga ttgagtcgac 2760 

gttactggcc gaagccgctt ggaataaggc cggtgtgcgt ttgtctatat gttattttcc 2820 

accatattgc cgtcttttgg caatgtgagg gcccggaaac ctggccctgt cttcttgacg 2880 

agcattccta ggggtctttc ccctctcgcc aaaggaatgc aaggtctgtt gaatgtcgtg 2940 

aaggaagcag ttcctctgga agcttcttga agacaaacaa cgtctgtagc gaccctttgc 3000 

aggcagcgga accccccacc tggcgacagg tgcctctgcg gccaaaagcc acgtgtataa 3060 

gatacacctg caaaggcggc acaaccccag tgccacgttg tgagttggat agttgtggaa 3120 

agagtcaaat ggctctcctc aagcgtattc aacaaggggc tgaaggatgc ccagaaggta 3180 

ccccattgta tgggatctga tctggggcct cggtgcacat gctttacatg tgtttagtcg 3240 

aggttaaaaa aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca 3300 

cgataatacc atgggtaagt gatatctact agttgtgacc ggcgcctagt gttgacaatt 3360 

aatcatcggc atagtatatc ggcatagtat aatacgactc actataggag ggccaccatg 3420 

tcgactacta accttcttct ctttcctaca gctgagatca ccggtaggag ggccatcatg 3480 
aaaaagcctg aactcaccgc gacgtctgtc gcgaagtttc tgatcgaaaa gttcgacagc 3540 

gtctccgacc tgatgcagct ctcggagggc gaagaatctc gtgctttcag cttcgatgta 3600 

ggagggcgtg gatatgtcct gcgggtaaat agctgcgccg atggtttcta caaagatcgt 3660 

tatgtttatc ggcactttgc atcggccgcg ctcccgattc cggaagtgct tgacattggg 3720 

gaattcagcg agagcctgac ctattgcatc tcccgccgtg cacagggtgt cacgttgcaa 3780 

gacctgcctg aaaccgaact gcccgctgtt ctgcaacccg tcgcggagct catggatgcg 3840 

atcgctgcgg ccgatcttag ccagacgagc gggttcggcc cattcggacc gcaaggaatc 3900 

ggtcaataca ctacatggcg tgatttcata tgcgcgattg ctgatcccca tgtgtatcac 3960 

tggcaaactg tgatggacga caccgtcagt gcgtccgtcg cgcaggctct cgatgagctg 4020 

atgctttggg ccgaggactg ccccgaagtc cggcacctcg tgcacgcgga tttcggctcc 4080 

aacaatgtcc tgacggacaa tggccgcata acagcggtca ttgactggag cgaggcgatg 4140 

ttcggggatt cccaatacga ggtcgccaac atcttcttct ggaggccgtg gttggcttgt 4200 

atggagcagc agacgcgcta cttcgagcgg aggcatccgg agcttgcagg atcgccgcgg 4260 

ctccgggcgt atatgctccg cattggtctt gaccaactct atcagagctt ggttgacggc 4320 

aatttcgatg atgcagcttg ggcgcagggt cgatgcgacg caatcgtccg atccggagcc 4380 

gggactgtcg ggcgtacaca aatcgcccgc agaagcgcgg ccgtctggac cgatggctgt 4440 

gtagaagtac tcgccgatag tggaaaccga cgccccagca ctcgtccgag ggcaaaggaa 4500 

tgagtcgaga attcgctaga gggccctatt ctatagtgtc acctaaatgc tagagctcgc 4560 

tgatcagcct cgactgtgcc ttctagttgc cagccatctg ttgtttgccc ctcccccgtg 4620 

ccttccttga ccctggaagg tgccactccc actgtccttt cctaataaaa tgaggaaatt 4680 

gcatcgcatt gtctgagtag gtgtcattct attctggggg gtggggtggg gcaggacagc 4740 

aagggggagg attgggaaga caatagcagg catgcgcagg gcccaattgc tcgagcggcc 4800 

gcaataaaat atctttattt tcattacatc tgtgtgttgg ttttttgtgt gaatcgtaac 4860 

taacatacgc tctccatcaa aacaaaacga aacaaaacaa actagcaaaa taggctgtcc 4920 

ccagtgcaag tgcaggtgcc agaacatttc tctatcgaag gatctgcgat cgctccggtg 4980 

cccgtcagtg ggcagagcgc acatcgccca cagtccccga gaagttgggg ggaggggtcg 5040 

gcaattgaac cggtgcctag agaaggtggc gcggggtaaa ctgggaaagt gatgtcgtgt 5100 

actggctccg cctttttccc gagggtgggg gagaaccgta tataagtgca gtagtcgccg 5160 

tgaacgttct ttttcgcaac gggtttgccg ccagaacaca gctgaagctt cgaggggctc 5220 

gcatctctcc ttcacgcgcc cgccgcccta cctgaggccg ccatccacgc cggttgagtc 5280 

gcgttctgcc gcctcccgcc tgtggtgcct cctgaactgc gtccgccgtc taggtaagtt 5340 

taaagctcag gtcgagaccg ggcctttgtc cggcgctccc ttggagccta cctagactca 5400 

gccggctctc cacgctttgc ctgaccctgc ttgctcaact ctacgtcttt gtttcgtttt 5460 

ctgttctgcg ccgttacaga tccaagctgt gaccggcgcc tacgtaagtg atatctacta 5520 

gatttatcaa aaagagtgtt gacttgtgag cgctcacaat tgatacttag attcatcgag 5580 

agggacacgt cgactactaa ccttcttctc tttcctacag ctgagat 5627 



<210> 28 
<211> 553 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pMG plasmid from InvivoGen; EMCV IRES sequence 

<400> 28 

aacgttactg gccgaagccg cttggaataa ggccggtgtg cgtttgtcta tatgttattt 60 

tccaccatat tgccgtcttt tggcaatgtg agggcccgga aacctggccc tgtcttcttg 120 

acgagcattc ctaggggtct ttcccctctc gccaaaggaa tgcaaggtct gttgaatgtc 180 

gtgaaggaag cagttcctct ggaagcttct tgaagacaaa caacgtctgt agcgaccctt 240 

tgcaggcagc ggaacccccc acctggcgac aggtgcctct gcggccaaaa gccacgtgta 300 

taagatacac ctgcaaaggc ggcacaaccc cagtgccacg ttgtgagttg gatagttgtg 360 

gaaagagtca aatggctctc ctcaagcgta ttcaacaagg ggctgaagga tgcccagaag 420 

gtaccccatt gtatgggatc tgatctgggg cctcggtgca catgctttac gtgtgtttag 480 

tcgaggttaa aaaacgtcta ggccccccga accacgggga cgtggttttc ctttgaaaaa 540 

cacgatgata ata 553 

<210> 29 
<211> 4692 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pDsRed1-N1 plasmid from Clontech 



<400> 29 

tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg 



60 
cgttacataa 
gacgtcaata 
atgggtggag 
aagtacgccc 
catgacctta 
catggtgatg 
atttccaagt 
ggactttcca 
acggtgggag 
ccggactcag 
gatccaccgg 
ttcaaggtgc 
gagggccgcc 
ctgcccttcg 
aagcaccccg 
gagcgcgtga 
caggacggct 
cccgtaatgc 
gacggcgtgc 
ctggtggagt 
tacgtggact 
tacgagcgca 
atcagccata 
ctgaacctga 
aatggttaca 
cattctagtt 
taatattttg 
ggccgaaatc 
tgttccagtt 
aaaaaccgtc 
ggggtcgagg 
ttgacgggga 
cgctagggcg 
taatgcgccg 
tatttgttta 
ataaatgctt 
aatgtgtgtc 
agcatgcatc 
agaagtatgc 
cccatcccgc 
ttttttattt 
ggaggctttt 
gtttcgcatg 
gctattcggc 
gctgtcagcg 
tgaactgcaa 
agctgtgctc 

ggggcaggat 

tgcaatgcgg 
acatcgcatc 
ggacgaagag 
gcccgacggc 
ggaaaatggc 
tcaggacata 
ccgcttcctc 
ccttcttgac 
cccaacctgc 
ggaatcgttt 
ttcttcgccc 
acccgcgcfca 
cataaacgcg 
tggggccaat 
ggcccagggc 
tatatacttt 
ctttttgata 
gaccccgtag 
tgcttgcaaa 



cttacggtaa 
atgacgtatg 
tatttacggt 
cctattgacg 
tgggactttc 
cggttttggc 
ctccacccca 
aaatgtcgta 
gtctatataa 
atctcgagct 
tcgccaccat 
gcatggaggg 
cctacgaggg 
cctgggacat 
ccgacatccc 
tgaacttcga 
gcttcatcta 
agaagaagac 
tgaagggcga 
tcaagtccat 
ccaagctgga 
ccgagggccg 
ccacatttgt 
aacataaaat 
aataaagcaa 
gtggtttgtc 
ttaaaattcg 
ggcaaaatcc 
tggaacaaga 
tatcagggcg 
tgccgtaaag 
aagccggcga 
ctggcaagtg 
ctacagggcg 
tttttctaaa 
caataatatt 
agttagggtg 
tcaattagtc 
aaagcatgca 
ccctaactcc 
atgcagaggc 
ttggaggcct 
attgaacaag 
tatgactggg 
caggggcgcc 
gacgaggcag 
gacgttgtca 
ctcctgtcat 
cggctgcata 
gagcgagcac 
catcaggggc 
gaggatctcg 
cgcttttctg 
gcgttggcta 
gtgctttacg 
gagttcttct 
catcacgaga 
tccgggacgc 
accctagggg 
tgacggcaat 
gggttcggtc 
acgcccgcgt 
tcgcagccaa 
agattgattt 
atctcatgac 
aaaagatcaa 
caaaaaaacc 



atggcccgcc 
ttcccatagt 
aaactgccca 
tcaatgacgg 
ctacttggca 
agtacatcaa 
ttgacgtcaa 
acaacfcccgc 
gcagagctgg 
caagcttcga 
ggtgcgctcc 
caccgtgaac 
ccacaacacc 
cctgtccccc 
cgactacaag 
ggacggcggc 
caaggtgaag 
catgggctgg 
gatccacaag 
ctacatggcc 
catcacctcc 
ccaccacctg 
agaggtttta 
gaafcgcaafct 
tagcatcaca 
caaactcatc 
cgttaaattt 
cttataaatc 
gtccactatt 
afcggcccact 
cactaaatcg 
acgtggcgag 
tagcggtcac 
cgtcaggtgg 
tacattcaaa 
gaaaaaggaa 
tggaaagtcc 
agcaaccagg 
tctcaattag 
gcccagttcc 
cgaggccgcc 
aggcttttgc 
atggattgca 
cacaacagac 
cggttctttt 
cgcggctatc 
ctgaagcggg 
ctcaccttgc 
cgcttgatcc 
gtactcggat 
tcgcgccagc 
tcgtgaccca 
gattcatcga 
cccgtgatat 
gtatcgccgc 
gagcgggact 
tttcgattcc 
cggctggatg 
gaggctaact 
aaaaagacag 
ccagggctgg 
ttcttccttt 
cgtcggggcg 
aaaacttcat 
caaaatccct 
aggatcttct 
accgctacca 



tggctgaccg 
aacgccaata 
cttggcagta 
taaatggccc 
gtacatctac 

tgggcgtgga 
tgggagtttg 
cccattgacg 
tttagtgaac 
attctgcagt 
tccaagaacg 
ggccacgagt 
gtgaagctga 
cagttccagt 
aagctgtcct 
grtggtgaccg 
ttcatcggcg 
gaggcctcca 
gccctgaagc 
aagaagcccg 
cacaacgagg 
ttcctgtagc 
cttgctttaa 
gfcfcgttgtta 
aatttcacaa 
aatgtatctt 
ttgttaaatc 
aaaagaatag 
aaagaacgtg 
acgtgaacca 
gaaccctaaa 
aaaggaaggg 
gctgcgcgta 
cacttttcgg 
tatgtatccg 
gagtcctgag 
ccaggctccc 
tgtggaaagt 
tcagcaacca 
gcccattctc 
tcggcctctg 
aaagatcgat 
cgcaggttct 
aatcggctgc 
tgtcaagacc 
gtggctggcc 
aagggactgg 
fcccfcgccgag 
ggctacctgc 
ggaagccggt 
cgaactgttc 

tggcgatgcc 
ctgtggccgg 
tgctgaagag 
tcccgattcg 
ctggggttcg 
accgccgcct 
atcctccagc 
gaaacacgga 
aataaaacgc 
cactctgtcg 
tccccacccc 
gcaggccctg 
ttttaattta 
taacgtgagt 
tgagatcctt 
gcggtggttt 



cccaacgacc 
gggactttcc 
catcaagtgt 
gcctggcatt 
gtattagtca 
tagcggtttg 
ttttggcacc 
caaatgggcg 
cgtcagatcc 
cgacggtacc 
tcatcaagga 
tcgagatcga 
aggtgaccaa 
acggctccaa 
tccccgaggg 
tgacccagga 
tgaacttccc 
ccgagcgcct 
tgaaggacgg 
tgcagctgcc 
actacaccat 
ggccgcgact 
aaaacctccc 
acttgtttat 
ataaagcatt 
aaggcgtaaa 
agctcatttt 
accgagatag 
gactccaacg 
tcaccctaat 
gggagccccc 
aagaaagcga 
accaccacac 
ggaaatgtgc 
ctcatgagac 
gcggaaagaa 
cagcaggcag 
ccccaggctc 
tagtcccgcc 
cgccccatgg 
agctattcca 
caagagacag 
ccggccgcfct 
tetgatgecg 
gacctgtccg 
aegaegggeg 
ctgctattgg 
aaagtatcca 
ccattcgacc 
cttgtcgatc 
gccaggctca 
tgettgeega 
ctgggtgtgg 
cttggcggcg 
cagcgcatcg 
aaatgaccga 
tctatgaaag 
gcgggga t c t 
aggagacaat 
acggtgttgg 
ataccccacc 
accccccaag 
ccatagcctc 
aaaggatcta 
tttcgttcca 
tttttctgcg 
gtttgccgga 



cccgcccatt 
attgaegtea 
ateatatgee 
abgcccagta 
tegctattae 
actcaegggg 
aaaatcaacg 
gtaggcgtgt 
getagegcta 
gcgggcccgg 
gttcatgege 
gggegaggge 
gggcggcccc 
ggtgtacgtg 
cttcaagtgg 
ctcctccctg 
ctccgacggc 
gtacccccgc 
cggccactac 
cggctactac 
cgtggagcag 
ctagatcata 
acacctcccc 
tgcagcttat 
tttttcactg 
ttgtaagcgt 
ttaaccaata 
ggttgagtgt 
teaaagggeg 
caagtttttt 
gatttagagc 
aaggagcggg 
ccgccgcgct 
gcggaacccc 
aataaccctg 
ccagctgtgg 
aagtatgcaa 
cccagcaggc 
cctaactccg 
ctgactaatt 
gaagtagtga 
gatgaggatc 
gggtggagag 
ccgtgttccg 
gtgccctgaa 
ttccttgcgc 
gcgaagtgcc 
tcatggctga 
accaagegaa 
aggatgatct 
aggegagcat 
atatcatggt 
cggaccgcta 
aatgggctga 
ccttctatcg 
ccaagcgacg 
gttgggcttc 
catgetggag 
aceggaagga 
gfccgtttgtt 
gagaccccat 
ttcgggtgaa 
aggttactca 
ggtgaagatc 
ctgagegtea 
cgtaatctgc 
tcaagagcta 



120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 






ccaactcttt 
ctagtgtagc 
gctctgctaa 
ttggactcaa 
tgcacacagc 
ctatgagaaa 
agggtcggaa 
agtcctgtcg 
gggcggagcc 
tggccttttg 
accgccatgc 



ttccgaaggt 
cgtagttagg 
tcctgttacc 
gacgatagtt 
ccagcttgga 
gcgccacgcfc 
caggagagcg 
ggtttcgcca 
tatggaaaaa 
ctcacatgtt 
at 



aactggcttc 
ccaccacttc 
agtggctgct 
accggataag 
gcgaacgacc 
tcccgaaggg 
cacgagggag 
cctctgactt 
cgccagcaac 
ctttcctgcg 



agcagagcgc 
aagaactctg 
gccagtggcg 
gcgcagcggt 
tacaccgaac 
agaaaggcgg 
cttccagggg 
gagcgtcgat 
gcggcctttt 
ttatcccctg 



agataccaaa 
tagcaccgcc 
ataagtcgtg 
cgggctgaac 
tgagatacct 
acaggtatcc 
gaaacgcctg 
ttttgtgatg 
tacggttcct 
attctgtgga 



tactgtcctt 
tacatacctc 
tcttaccggg 

ggggggttcg 

acagcgtgag 
ggtaagcggc 
gtatctttat 
ctcgtcaggg 
ggccttttgc 
taaccgtatt 



<210> 30 
<211> 4257 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pPur plasmid from Clontech 



<400> 30 

ctgtggaatg 

atgcaaagca 

gcaggcagaa 

actccgccca 

ctaatttttt 

tagtgaggag 

ggccgccacg 

gacgaccttc 

cccgggccgt 

tcgacccgga 

tcgggctcga 

ccacgccgga 

agttgagcgg 

ggcccaagga 

agggtctggg 

ccgccttcct 

ccgtcaccgc 

ccggtgcctg 

atggctccga 

caccgactct 

aaaaacctcc 

aacttgttta 

aataaagcat 

tatcatgtct 

ttgagaggac 

gtcacttaac 

tttaaaatat 

acaaatgtca 

ctcatcaaga 

cccacctgtg 

gcactccact 

ctgactgtca 

gtttgctaac 

tgacccttga 

gtttaacata 

aatatttcca 

ggcctcgtga 

tcaggtggca 

cattcaaata 

aaaaggaaga 

ttttgccttc 

cagttgggtg 

agttttcgcc 

gcggtattat 

cagaatgact 

gtaagagaat 



tgtgtcagtt 
tgcatctcaa 
gtatgcaaag 
tcccgcccct 
ttatttatgc 
gcttttttgg 
accggtgccg 
catgaccgag 
acgcaccctc 
ccgccacatc 
catcggcaag 
gagcgtcgaa 
ttcccggctg 
gcccgcgtgg 
cagcgccgtc 
ggagacctcc 
cgacgtcgag 
acgcccgccc 
ccgaagccga 
agaggatcat 
cacacctccc 
ttgcagctta 
ttttttcact 
ggatccccag 
attccaatca 
aaaaaggaaa 
ctgggaagtc 
acagcagaaa 
agcactgtgg 
taggttccaa 
ggataagcat 
actgtagcat 
acaccctgca 
atgggttttc 
gcagttaccc 
caggttaagt 
tacgcctatt 
cttttcgggg 
tgtatccgct 
gtatgagtat 
ctgtttttgc 
cacgagtggg 
ccgaagaacg 
cccgtgttga 
tggttgagta 
tatgcagtgc 



agggtgtgga 
ttagtcagca 
catgcatctc 
aactccgccc 
agaggccgag 
aggcctaggc 
ccaccatccc 
tacaagccca 
gccgccgcgt 
gagcgggtca 
gtgtgggtcg 

gcgggggcgg 

gccgcgcagc 
ttcctggcca 
gtgctccccg 
gcgccccgca 
gtgcccgaag 
cacgacccgc 
cccgggcggc 
aatcagccat 
cctgaacctg 
taatggttac 
gcattctagt 
gaagctcctc 
taggctgccc 
ttgggtaggg 
ccttccactg 
catacaagct 
ttgctgtgtt 
aatatctagt 
tatccttatc 
tttttggggt 
gctccaaagg 
cagcaccatt 
caataacctc 
cctcatttaa 
tttataggtt 
aaatgtgcgc 
catgagacaa 
tcaacatttc 
tcacccagaa 
ttacatcgaa 
ttttccaatg 
cgccgggcaa 
ctcaccagtc 
tgccataacc 



aagtccccag 
accaggtgtg 
aattagtcag 
agttccgccc 
gccgcctcgg 
ttttgcaaaa 
ctgacccacg 
cggtgcgcct 
tcgccgacta 
ccgagctgca 
cggacgacgg 
tgttcgccga 
aacagatgga 
ccgtcggcgt 
gagtggaggc 
acctcccctt 
gaccgcgcac 
agcgcccgac 
cccgccgacc 
accacatttg 
aaacataaaa 
aaataaagca 
tgtggtttgt 

tgtgtcctca 
atccaccctc 
gtttttcaca 
ctgtgttcca 
gtcagctttg 
agtaatgtgc 
gttttcattt 
caaaacagcc 
tacagtttga 
ttccccacca 
ttcatgagtt 
agttttaaca 
attaggcaaa 
aatgtcatga 
ggaaccccta 
taaccctgat 
cgtgtcgccc 
acgctggtga 
ctggatctca 
atgagcactt 
gagcaactcg 
acagaaaagc 
atgagtgata 



gctccccagc 
gaaagtcccc 
caaccatagt 
attctccgcc 
cctctgagct 
agcttgcatg 
cccctgaccc 
cgccacccgc 
ccccgccacg 
agaactcttc 
cgccgcggtg 
gatcggcccg 
aggcctcctg 
ctcgcccgac 
ggccgagcgc 
ctacgagcgg 
ctggtgcatg 
cgaaaggagc 
ccgcacccgc 
tagaggtttt 
tgaatgcaat 
atagcatcac 
ccaaactcat 
taaaccctaa 
tgtgtcctcc 
gaccgctttc 
gaagtgttgg 
cacaagggcc 
aaaacaggag 
ttacttggat 
ttgtggtcag 
gcaggatatt 
acagcaaaaa 
ttttgtgtcc 
gtaacagctt 
ggaattcttg 
taataatggt 
tttgtttatt 
aaatgcttca 
ttattccctt 
aagtaaaaga 
acagcggtaa 
ttaaagttct 
gtcgccgcat 
atcttacgga 
acactgcggc 



aggcagaagt 
aggctcccca 
cccgccccta 
ccatggctga 
attccagaag 
cctgcaggtc 
ctcacaagga 
gacgacgtcc 
cgccacaccg 
ctcacgcgcg 
gcggtctgga 
cgcatggccg 
gcgccgcacc 
caccagggca 
gccggggtgc 
ctcggcttca 
acccgcaagc 
gcacgacccc 
ccccgaggcc 
acttgcttta 
tgttgttgtt 
aaatttcaca 
caatgtatct 
cctcctctac 
tgttaattag 
taagggtaat 
taaacagccc 
caacaccctg 
gcacattttc 
caggaaccca 
tgttcatctg 
tggtcctgta 
aatgaaaatt 
ctgaatgcaa 
cccacatcaa 
aagacgaaag 
ttcttagacg 
tttctaaata 
ataatattga 
ttttgcggca 
tgctgaagat 
gatccttgag 
gctatgtggc 
acactattct 
tggcatgaca 
caacttactt 



4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4692 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 



ctgacaacga 
gtaactcgcc 
gacaccacga 
cttactctag 
ccacttctgc 
gagcgtgggt 
gtagttatct 
gagataggtg 
ctttagattg 
gataatctca 
gtagaaaaga 
caaacaaaaa 
ctttttccga 
tagccgtagt 
ctaatcctgt 
tcaagacgat 
cagcccagct 
gaaagcgcca 
ggaacaggag 
gtcgggtttc 
agcctatgga 
tttgctcaca 
tttgagtgag 
gaggaagcgg 
caccgcatat 



tcggaggacc 
ttgatcgttg 
tgcctgcagc 
cttcccggca 
gctcggccct 
ctcgcggtat 
acacgacggg 
cctcactgat 
atttaaaact 
tgaccaaaat 
tcaaaggatc 
aaccaccgct 
aggtaactgg 
taggccacca 
taccagtggc 
agttaccgga 
tggagcgaac 
cgcttcccga 
agcgcacgag 
gccacctctg 
aaaacgccag 
tgttctttcc 
ctgataccgc 
aagagcgcct 
ggtgcactct 



gaaggagcta 
ggaaccggag 
aatggcaaca 
acaattaata 
tccggctggc 
cattgcagca 
gagtcaggca 
taagcattgg 
tcatttttaa 
cccttaacgt 
ttcttgagat 
accagcggtg 
cttcagcaga 
cttcaagaac 
tgctgccagt 
taaggcgcag 
gacctacacc 
agggagaaag 
ggagcttcca 
acttgagcgt 
caacgcggcc 
tgcgttatcc 
tcgccgcagc 
gatgcggtat 
cagtacaatc 




accgcttttt 
ctgaatgaag 
acgttgcgca 
gactggatgg 
tggtttattg 
ctggggccag 
actatggatg 
taactgtcag 
tttaaaagga 
gagttttcgt 
cctttttttc 
gtttgtttgc 
gcgcagatac 
tctgtagcac 
ggcgataagt 
cggtcgggct 
gaactgagat 
gcggacaggt 

gggggaaacg 

cgatttttgt 
tttttacggt 
cctgattctg 
cgaacgaccg 
tttctcctta 
tgctctgatg 



tgcacaacat 
ccataccaaa 
aactattaac 
aggcggataa 
ctgataaatc 
atggtaagcc 
aacgaaatag 
accaagttta 
tctaggtgaa 
tccactgagc 
tgcgcgtaat 
cggatcaaga 
caaatactgt 
cgcctacata 
cgtgtcttac 
gaacgggggg 
acctacagcg 
atccggtaag 
cctggtatct 
gatgctcgtc 
tcctggcctt 
t ggat aac eg 
agegcagega 
cgcatctgtg 
ccgcatagtt 



gggggat cat 
egacgagegt 
tggegaacta 
agttgcagga 
tggagccggt 
ctcccgtatc 
acagatcget 
ctcatatata 
gatccttttt 
gtcagacccc 
ctgctgcttg 
gctaccaact 
ccttctagtg 
cctcgctctg 
cgggttggac 
ttcgtgcaca 
tgagctatga 
eggcagggtc 
ttatagtcct 

aggggggegg 

ttgctggcct 
tattaccgcc 
gtcagtgagc 
eggtatttea 
aagccag 



<210> 31 

<211> 8136 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pWE15 cosmid vector 
<300> 

<308> GenBank X65279 

<309> 1995-04-14 



2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4257 



<400> 31 

ctatagtgag 

cctattttta 

t eggggaaat 

tccgctcatg 

gagtattcaa 

tttgctcacc 

gtgggttaca 

gaacgttttc 

gttgacgccg 

gagtactcac 

agtgctgcca 

ggaccgaagg 

cgttgggaac 

gcagcaatgg 

eggcaacaat 

gcccttccgg 

ggtatcattg 

aeggggagtc 

ctgattaagc 

aaacttcatt 

aaaatccctt 

ggatcttctt 

ccgctaccag 

actggcttca 

caccacttca 

gtggctgctg 

ceggataagg 

cgaacgacct 



tegtattatg 
taggttaatg 
gtgcgcggaa 
agacaataac 
catttcegtg 
cagaaacget 
tcgaactgga 
caatgatgag 
ggcaagagca 
cagtcacaga 
taaccatgag 
agctaaccgc 
eggagctgaa 
caacaaegtt 
taatagactg 
ctggctggtt 
cagcactggg 
aggcaactat 
attggtaact 
tttaatttaa 
aacgt gagt t 
gagatccttt 
cggtggtttg 
geagagegea 
agaactctgt 
ccagtggcga 
cgcagcggtc 
acaccgaact 



cggccgcgaa 
tcatgataat 
cccctatttg 
cctgataaat 
tcgcccttat 
ggtgaaagt a 
tctcaacagc 
cacttttaaa 
actcggtcgc 
aaagcatctt 
tgataacact 
ttttttgeae 
tgaagecata 
gegcaaacta 
gatggaggcg 
tattgetgat 
gccagatggt 
gg a t gaacga 
gtcagaccaa 
aaggatctag 
ttcgttccac 
ttttctgege 
tttgeeggat 
gataccaaat 
agcaccgcct 
taagtcgtgt 
gggctgaacg 
gagataccta 



ttcttgaaga 
aatggtttct 
tttatttttc 
gcttcaataa 
tccctttttt 
aaagatgctg 
ggtaagatcc 
gttctgetat 
cgcatacact 
aeggatggea 
gcggccaact 
aacatggggg 
ccaaacgacg 
ttaactggcg 
gataaagttg 
aaatctggag 
aagccctccc 
aatagacaga 
gtttactcat 
gtgaagatcc 
tgagcgtcag 
gtaatctget 
caagagctac 
actgtccttc 
acatacctcg 
ettacegggt 

gggggttcgt 

cagegtgage 



egaaagggee 
tagaegtcag 
taaatacatt 
tattgaaaaa 
gcggcatttt 
aagatcagtt 
ttgagagttt 

gtggcgcggt 

attctcagaa 
t gacagt aag 
tacttctgac 
atcatgtaac 
agegtgacac 
aactacttac 
caggaccact 
ccggtgagcg 
gtatcgtagt 
tegctgagat 
atatacttta 
tttttgataa 
accccgtaga 
gettgeaaac 
caactctttt 
tagtgtagcc 
etctgetaat 
tggactcaag 
gcacacagcc 
tatgagaaag 



tcgtgatacg 
gtggcacttt 
caaatatgta 
ggaagagtat 
gcttcctgtt 
gggtgcacga 
tcgccccgaa 
attatcccgt 
tgacttggtt 
agaattatgc 
aacgategga 
tegecttgat 
cacgatgcct 
tctagcttcc 
tctgcgctcg 
tgggtctege 
tatctacacg 
aggtgectea 
gattgattta 
tctcatgacc 
aaagatcaaa 
aaaaaaacca 
tccgaaggta 
gtagttaggc 
cctgttacca 
acgatagtta 
cagcttggag 
cgccacgctt 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 






ccgaagggag 
cgagggagct 
ctctgacttg 
gccagcaacg 
tttcctgcgt 
accgctcgcc 
cgctgacttc 
fcgctcaggtc 
ttcattctgc 
cacgatcatg 
accacaacta 
fctatttgtaa 
atgtttcagg 
tgtggtatgg 
cccctttaca 
acactctatg 
tacttataaa 
cctttgtggt 
aaaaacttag 
gaggagt aga 
agcaaaacag 
aggttggaat 
aaaattttat 
caccacagaa 
actcctgcag 
cggatttgca 
actcgcgagg 
gaggatcatc 
ggcggtggaa 
ccccagagtc 
tcgggagcgg 
tcagcaatat 
ccacagtcga 
tcgccatggg 
gttcggctgg 
cttccatccg 
tagccggatc 
caggagcaag 
cccttcccgc 
gccacgatag 
tgacaaaaag 
cgattgtctg 

ctgcgtgcaa 

atcttgatcc 
agggcttccc 
ataaaaccgc 
ttgcgcttgc 
accgtttctg 
agtgcttgcg 
actacttctg 
ggggcggaga 
ggactatggt 
ctggggactt 
ctgctgggga 
gcaggaccca 
ggatatgttc 
tccaattctt 
gtggcccggc 
cctacaatcc 
tcagcggtcc 
cctgatggtc 
gccggaagcg 
caagacgtag 
aacgtttggt 
ccgcaagcga 
cccagagcgc 
cggcgacgat 



aaaggcggac 
tccaggggga 
agcgtcgatt 
cggccttttt 
tatcccctga 
gcagccgaac 
cgcgtttcca 
gcagacgttt 
taaccagtaa 
cgcacccgtc 
gaatgcagtg 
ccattataag 
ttcaggggga 
ctgattatga 
aattaaaaag 
cctgtgtgga 
ggttacagaa 
gtaaatagca 
caattctgaa 
atgttgagag 
gttttcctca 
ctaaaataca 
atttacctta 
gtaaggttcc 
ttcgggggca 
ctgccggtag 
ggatcgagcc 
cagccggcgt 
tcgaaatctc 
ccgctcagaa 
cgataccgta 
cacgggtagc 
tgaatccaga 
tcacgacgag 
cgcgagcccc 
agtacgtgct 
aagcgtatgc 
gtgagatgac 
ttcagtgaca 
ccgcgctgcc 
aaccgggcgc 
ttgtgcccag 
tccatcttgt 
cctgcgccat 
aaccttacca 
ccagtctagc 
gttttccctt 
cggactggct 
gcagcgtgaa 
gaatagctca 
atgggcggaa 
tgctgactaa 
tccacacctg 
gcctggggac 
acgctgcccg 
tgccaagggt 
ggagtggtga 
tccatgcacc 
atgccaaccc 
aatgatcgaa 
gtcatctacc 
agaagaatca 
cccagcgcgt 
ggcgggacca 
caggccgatc 
tgccggcacc 
agtcatgccc 



aggtafcccgg 
aacgcctggt 
tttgtgatgc 
acggttcctg 
ttctgtggat 
gaccgagcgc 
gactttacga 
tgcagcagca 
ggcaaccccg 
agatccagac 
aaaaaaatgc 
ctgcaataaa 
ggtgtgggag 

tctctagtca 

ctaaaggtac 

gtaagaaaaa 

tatttttcca 

aagcaagcaa 

ggaaagtcct 

tcagcagtag 

ttaaaggcat 

caaacaatta 

gagctttaaa 

ttcacaaaga 

tggatgcgcg 

aactcgcgag 

cggggtgggc 

cccggaaaac 

gtgatggcag 

gaactcgtca 

aagcacgagg 

caacgctatg 

aaagcggcca 

atcctcgccg 

tgatgctctt 

cgctcgatgc 

agccgccgca 

aggagatcct 

acgtcgagca 

tcgtcctgca 

ccctgcgctg 

tcatagccga 

tcaatcatgc 

cagatccttg 

gagggcgccc 

tatcgccatg 

g tec aga tag 

ttctacgtgt 

agetttttge 

gaggecgagg 

ctgggcggag 

ttgagatgea 

gttgetgact 

tttccacacc 

agatgegecg 

tggtttgcgc 

ateegttage 

gcgacgcaac 

gttccatgtg 

gttaggctgg 

tgcctggaca 

taatggggaa 

cgggccgcca 

gtgacgaagg 

atcgtcgcgc 

tgtcctacga 

cgcgcccacc 



taagcggcag 
atctttatag 
tegtcagggg 
gecttttget 
aacegtatta 
agegagtcag 
aacaeggaaa 
gtcgcttcac 
ccagcctagc 
atgataagat 
tttatttgtg 
caagttaaca 
gttttttaaa 
aggcactata 
acaatttttg 
acagtatgtt 
taattttctt 
gagttctatt 

tggggtcttc 

cctcatcatc 

tccaccactg 

gaatcagtag 

tctctgtagg 

tccggaccaa 

gatagecget 

gtcgtccagc 

gaagaactcc 

gattccgaag 

gttgggcgtc 

agaaggegat 

aagcggtcag 

tcctgatagc 

ttttccacca 

tegggatgeg 

cgtccagatc 

gatgtttcgc 

ttgeatcage 

gccccggcac 

cagctgcgca 

gttcattcag 

acagceggaa 

atagcctctc 

gaaacgatcc 

gcggcaagaa 

cagctggcaa 

taagcccact 

cccagtagct 

tccgcttcct 

aaaagectag 

eggectaaat 

ttaggggegg 

tgetttgeat 

aattgagatg 

ctaactgaca 

cgtgcggctg 

attcacagtt 

gaggtgeege 

geggggagge 

ctcgccgagg 

taagagcege 

gcatggcctg 

ggccatccag 

tgeeggegat 

ettgagegag 

tecagegaaa 

gttgeatgat 

ggaaggagct 



ggteggaaca 

tcctgtcggg 

ggeggagect 

ggccttttgc 

ccgcctttga 

tgagcgagga 

ccgaagacca 

gttcgctcgc 

cgggtcctca 

acattgatga 

aaatttgtga 

acaacaattg 

gcaagtaaaa 

catcaaatat 

agcatagtta 

atgattataa 

gtatagcagt 

actaaacaca 

tacctttctc 

actagatggc 

ctcccattca 

tttaacacat 

tagtttgtcc 

ageggecat c 

gctggtttcc 

ctcaggcagc 

agcatgagat 

cccaaccttt 

gettggtegg 

agaaggegat 

cccattcgcc 

ggtccgccac 

tgatattegg 

cgccttgagc 

atcctgatcg 

ttggtggtcg 

catgatggat 

ttcgcccaat 

aggaacgccc 

ggcaccggac 

cacggcggca 

cacccaagcg 

tcatcctgtc 

agccatccag 

ttccggttcg 

gcaagctacc 

gacattcatc 

ttagcagccc 

gcctccaaaa 

aaaaaaaatt 

gatgggcgga 

acttctgcct 

catgetttge 

cacattccac 

ctggagatgg 

ctccgcaaga 

cggcttccat 

agacaaggta 

egcataaate 

gagegatect 

caacgcggca 

cctcgcgtcg 

aatggcctgc 

ggcgtgcaag 

gcggtcctcg 

aaagaagaca 

gactgggttg 



ggagagegea 

gtttcgccac 

atggaaaaac 

tcacatgttc 

gtgagctgat 

ageggaagag 

ttcatgttgt 

gtatcggtga 

acgacaggag 

gtttggacaa 

tgetattget 

cattcatttt 

cctctacaaa 

tccttattaa 

ttaatagcag 

ctgttatgcc 

gcagcttttt 

gcatgactca 

ttcttttttg 

atttcttctg 

tcagttccat 

tatacactta 

aattatgtca 

gtgcctcccc 

tggatgeega 

agctgaacca 

ccccgcgctg 

catagaaggc 

tcatttcgaa 

gege t gcgaa 

gccaagctct 

acccagccgg 

caagcaggca 

ctggcgaaca 

acaagacegg 

aatgggcagg 

actttctegg 

ageagecagt 

gtcgtggcca 

aggteggtet 

tcagagcagc 

geeggagaac 

tcttgatcag 

tttactttgc 

cttgctgtcc 

tgetttctet 

eggggtcage 

ttgcgccctg 

aagcctcctc 

agtcagecat 

gttaggggcg 

gctggggagc 

atacttctgc 

ageeggatet 

eggacgegat 

attgattggc 

tcaggtcgag 

tagggeggeg 

geegtgaega 

tgaagctgtc 

tcccgatgcc 

cgaacgccag 

ttctcgccga 

attccgaata 

c cgaaaa tga 

gtcataagtg 

aaggctctca 



1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 






agggcatcgg tcgacgctct cccttatgcg actcctgcat taggaagcag cccagtagta 5760 

ggttgaggcc gttgagcacc gccgccgcaa ggaatggtgc atgcaaggag atggcgccca 5820 

acagtccccc ggccacgggc ctgccaccat acccacgccg aaacaagcgc tcatgagccc 5880 

gaagtggcga gcccgatctt ccccatcggt gatgtcggcg atataggcgc cagcaaccgc 5940 

acctgtggcg ccggtgatgc cggccacgat gcgtccggcg tagaggatct tggcagtcac 6000 

agcatgcgca tatccatgct tcgaccatgc gctcacaaag taggtgaatg cgcaatgtag 6060 

tacccacatc gtcatcgctt tccactgctc tcgcgaataa agatggaaaa tcaatctcat 6120 

ggtaatagtc catgaaaatc cttgtattca taaatcctcc aggtagctat atgcaaattg 6180 

aaacaaaaga gatggtgatc tttctaagag atgatggaat ctcccttcag tatcccgatg 6240 

gtcaatgcgc tggatatggg atagatggga atatgctgat ttttatggga cagagttgcg 6300 

aactgttccc aactaaaatc attttgcacg atcagcgcac tacgaacttt acccacaaat 6360 

agtcaggtaa tgaatcctga tataaagaca ggttgataaa tcagtcttct acgcgcatcg 6420 

cacgcgcaca ccgtagaaag tctttcagtt gtgagcctgg gcaaaccgtt aactttcggc 6480 

ggctttgctg tgcgacaggc tcacgtctaa aaggaaataa atcatgggtc ataaaattat 6540 

cacgttgtcc ggcgcggcga cggatgttct gtatgcgctg tttttccgtg gcgcgttgct 6600 

gtctggtgat ctgccttcta aatctggcac agccgaattg cgcgagcttg gttttgctga 6660 

aaccagacac acagcaactg aataccagaa agaaaatcac tttacctttc tgacatcaga 6720 

agggcagaaa tttgccgttg aacacctggt caatacgcgt tttggtgagc agcaatattg 6780 

cgcttcgatg acgcttggcg ttgagattga tacctctgct gcacaaaagg caatcgacga 6840 

gctggaccag cgcattcgtg acaccgtctc cttcgaactt attcgcaatg gagtgtcatt 6900 

catcaaggac gccgctatcg caaatggtgc tatccacgca gcggcaatcg aaacacctca 6960 

gccggtgacc aatatctaca acatcagcct tggtatccag cgtgatgagc cagcgcagaa 7020 

caaggtaacc gtcagtgccg ataagttcaa agttaaacct ggtgttgata ccaacattga 7080 

aacgttgatc gaaaacgcgc tgaaaaacgc tgctgaatgt gcggcgctgg atgtcacaaa 7140 

gcaaatggca gcagacaaga aagcgatgga tgaactggct tcctatgtcc gcacggccat 7200 

catgatggaa tgtttccccg gtggtgttat ctggcagcag tgccgtcgat agtatgcaat 7260 

tgataattat tatcatttgc gggtcctttc cggcgatccg ccttgttacg gggcggcgac 7320 

ctcgcgggtt ttcgctattt atgaaaattt tccggtttaa ggcgtttccg ttcttcttcg 7380 

tcataactta atgtttttat ttaaaatacc ctctgaaaag aaaggaaacg acaggtgctg 7440 

aaagcgagct ttttggcctc tgtcgtttcc tttctctgtt tttgtccgtg gaatgaacaa 7500 

tggaagtcaa caaaaagcag ctggctgaca ttttcggtgc gagtatccgt accattcaga 7560 

actggcagga acagggaatg cccgttctgc gaggcggtgg caagggtaat gaggtgcttt 7620 

atgactctgc cgccgtcata aaatggtatg ccgaaaggga tgctgaaatt gagaacgaaa 7680 

agctgcgccg ggaggttgaa gaactgcggc aggccagcga ggcagatcca caggacgggt 7740 

gtggtcgcca tgatcgcgta gtcgatagtg gctccaagta gcgaagcgag caggactggg 7800 

cggcggcaaa gcggtcggac agtgctccga gaacgggtgc gcatagaaat tgcatcaacg 7860 

catatagcgc tagcagcacg ccatagtgac tggcgatgct gtcggaatgg acgatatccc 7920 

gcaagaggcc cggcagtacc ggcataacca agcctatgcc tacagcatcc agggtgacgg 7980 

tgccgaggat gacgatgagc gcattgttag atttcataca cggtgcctga ctgcgttagc 8040 

aatttaactg tgataaacta ccgcattaaa gcttatcgat gataagcggt caaacatgag 8100 

aattcgcggc cgcaattaac cctcactaaa ggatcc 8136 

<210> 32 
<211> 2713 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pNEB193 plasmid 
<400> 32 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 

ttggcgggtg ccggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 

accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240 

attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300 

tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360 

tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt cgagctcggt acccgggggc 420 

gcgccggatc cttaattaag tctagagtcg actgtttaaa cctgcaggca tgcaagcttg 480 

gcgtaatcat ggtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac 540 

aacatacgag ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc 600 

acattaattg cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg 660 

cattaatgaa tcggccaacg cgcggggaga ggcggtttgc gtattgggcg ctcttccgct 720 

tcctcgctca ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac 780 

tcaaaggcgg taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga 840 

gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat 900 

aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac 960 

ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1020 

gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1080 

ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1140 

ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1200 

cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1260 

attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1320 

ggctacacta gaaggacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga 1380 

aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt 1440 

gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt 1500 

tctacggggt ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga 1560 

ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc 1620 

taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct 1680 

atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt cgtgtagata 1740 

actacgatac gggagggctt accatctggc cccagtgctg caatgatacc gcgagaccca 1800 

cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc cgagcgcaga 1860 

agtggtcctg caactttatc cgcctccatc cagtctatta attgttgccg ggaagctaga 1920 

gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac aggcatcgtg 1980 

gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga 2040 

gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt 2100 

gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct 2160 

cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca 2220 

ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat acgggataat 2280 

accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga 2340 

aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc 2400 

aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg 2460 

caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc 2520 

ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt 2580 

gaatgtattt agaaaaataa acaaataggg gttccgcgca catttccccg aaaagtgcca 2640 

cctgacgtct aagaaaccat tattatcatg acattaacct ataaaaatag gcgtatcacg 2700 

aggccctttc gtc 2713 

<210> 33 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP 
<400> 33 

cagctttttt atactaagtt g 21 

<210> 34 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB 
<400> 34 

ctgctttttt atactaactt g 21 

<210> 35 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL 
<400> 35 

ctgctttttt atactaagtt g 21 



<210> 36 
<211> 21 
<212> DNA 
<213> Artificial Sequence 
<220> 

<223> attR 
<400> 36 

cagctttttt atactaactt g 21 
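
The four sites of SEQ ID NOs: 33-36 are related as parental and recombinant sites: attL and attR each join one arm of attB to the opposite arm of attP around a shared core. The Python sketch below is an illustration only and not part of the listing. The function name `recombine` is hypothetical, and taking the core to be the longest stretch common to attP and attB is an assumption made for the example.

```python
from difflib import SequenceMatcher

ATT_P = "cagcttttttatactaagttg"  # SEQ ID NO: 33, written without spaces
ATT_B = "ctgcttttttatactaacttg"  # SEQ ID NO: 34, written without spaces


def recombine(att_p: str, att_b: str) -> tuple[str, str]:
    """Return (attL, attR) by exchanging the arms that flank the shared core.

    Assumption for this sketch: the core is the longest common substring
    of the two parental sites.
    """
    matcher = SequenceMatcher(None, att_p, att_b, autojunk=False)
    m = matcher.find_longest_match(0, len(att_p), 0, len(att_b))
    core = att_p[m.a : m.a + m.size]
    p_left, p_right = att_p[: m.a], att_p[m.a + m.size :]
    b_left, b_right = att_b[: m.b], att_b[m.b + m.size :]
    att_l = b_left + core + p_right  # left arm of attB, right arm of attP
    att_r = p_left + core + b_right  # left arm of attP, right arm of attB
    return att_l, att_r


att_l, att_r = recombine(ATT_P, ATT_B)
print(att_l)
print(att_r)
```

Under these assumptions the two printed strings can be compared directly with SEQ ID NO: 35 and SEQ ID NO: 36.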

<210> 37 
<211> 1071 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Integrase E174R 

<221> CDS 

<222> (1)...(1071) 

<223> Nucleotide sequence encoding Integrase E174R 
<400> 37 

atg gga aga agg cga agt cat gag cgc cgg gat tta ccc cct aac ctt 48 
Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu 
1 5 10 15 

tat ata aga aac aat gga tat tac tgc tac agg gac cca agg acg ggt 96 
Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly 
20 25 30 

aaa gag ttt gga tta ggc aga gac agg cga atc gca atc act gaa gct 144 
Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala 
35 40 45 

ata cag gcc aac att gag tta ttt tca gga cac aaa cac aag cct ctg 192 
Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu 
50 55 60 

aca gcg aga atc aac agt gat aat tcc gtt acg tta cat tca tgg ctt 240 
Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu 
65 70 75 80 

gat cgc tac gaa aaa atc ctg gcc agc aga gga atc aag cag aag aca 288 
Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr 
85 90 95 

ctc ata aat tac atg agc aaa att aaa gca ata agg agg ggt ctg cct 336 
Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro 
100 105 110 

gat gct cca ctt gaa gac atc acc aca aaa gaa att gcg gca atg ctc 384 
Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu 
115 120 125 

aat gga tac ata gac gag ggc aag gcg gcg tca gcc aag tta atc aga 432 
Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg 
130 135 140 

tca aca ctg agc gat gca ttc cga gag gca ata gct gaa ggc cat ata 480 
Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile 
145 150 155 160 

aca aca aac cat gtc gct gcc act cgc gca gca aaa tct aga gta agg 528 
Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg 
165 170 175 

aga tca aga ctt acg gct gac gaa tac ctg aaa att tat caa gca gca 576 
Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala 
180 185 190 

gaa tca tca cca tgt tgg ctc aga ctt gca atg gaa ctg gct gtt gtt 624 
Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val 
195 200 205 

acc ggg caa cga gtt ggt gat tta tgc gaa atg aag tgg tct gat atc 672 
Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile 
210 215 220 

gta gat gga tat ctt tat gtc gag caa agc aaa aca ggc gta aaa att 720 
Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile 
225 230 235 240 

gcc atc cca aca gca ttg cat att gat gct ctc gga ata tca atg aag 768 
Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys 
245 250 255 

gaa aca ctt gat aaa tgc aaa gag att ctt ggc gga gaa acc ata att 816 
Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile 
260 265 270 

gca tct act cgt cgc gaa ccg ctt tca tcc ggc aca gta tca agg tat 864 
Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr 
275 280 285 

ttt atg cgc gca cga aaa gca tca ggt ctt tcc ttc gaa ggg gat ccg 912 
Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro 
290 295 300 

cct acc ttt cac gag ttg cgc agt ttg tct gca aga ctc tat gag aag 960 
Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys 
305 310 315 320 

cag ata agc gat aag ttt gct caa cat ctt ctc ggg cat aag tcg gac 1008 
Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp 
325 330 335 

acc atg gca tca cag tat cgt gat gac aga ggc agg gag tgg gac aaa 1056 
Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys 
340 345 350 

att gaa atc aaa taa 1071 
Ile Glu Ile Lys * 
355 

<210> 38 
<211> 356 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Integrase E174R 
<400> 38 

Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu 
1 5 10 15 

Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly 
20 25 30 

Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala 
35 40 45 

Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu 
50 55 60 

Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu 
65 70 75 80 

Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr 
85 90 95 

Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro 
100 105 110 

Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu 
115 120 125 

Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg 
130 135 140 

Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile 
145 150 155 160 

Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg 
165 170 175 

Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala 
180 185 190 

Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val 
195 200 205 

Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile 
210 215 220 

Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile 
225 230 235 240 

Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys 
245 250 255 

Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile 
260 265 270 

Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr 
275 280 285 

Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro 
290 295 300 

Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys 
305 310 315 320 

Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp 
325 330 335 

Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys 
340 345 350 

Ile Glu Ile Lys 
355 

























<210> 39 
<211> 876 
<212> DNA 

<213> Discosoma species 

<220> 
<221> CDS 

<222> (45)...(737) 

<223> Nucleotide sequence encoding red fluorescent 
protein (FP593) 

<300> 

<308> GenBank AF272711 
<309> 2000-09-26 

<400> 39 

agtttcagcc agtgacaggg tgagctgcca ggtattctaa caag atg agt tgt tcc 56 
Met Ser Cys Ser 
1 

aag aat gtg atc aag gag ttc atg agg ttc aag gtt cgt atg gaa gga 104 
Lys Asn Val Ile Lys Glu Phe Met Arg Phe Lys Val Arg Met Glu Gly 
5 10 15 20 

acg gtc aat ggg cac gag ttt gaa ata aaa ggc gaa ggt gaa ggg agg 152 
Thr Val Asn Gly His Glu Phe Glu Ile Lys Gly Glu Gly Glu Gly Arg 
25 30 35 

cct tac gaa ggt cac tgt tcc gta aag ctt atg gta acc aag ggt gga 200 
Pro Tyr Glu Gly His Cys Ser Val Lys Leu Met Val Thr Lys Gly Gly 
40 45 50 

cct ttg cca ttt gct ttt gat att ttg tca cca caa ttt cag tat gga 248 
Pro Leu Pro Phe Ala Phe Asp Ile Leu Ser Pro Gln Phe Gln Tyr Gly 
55 60 65 

agc aag gta tat gtc aaa cac cct gcc gac ata cca gac tat aaa aag 296 
Ser Lys Val Tyr Val Lys His Pro Ala Asp Ile Pro Asp Tyr Lys Lys 
70 75 80 

ctg tca ttt cct gag gga ttt aaa tgg gaa agg gtc atg aac ttt gaa 344 
Leu Ser Phe Pro Glu Gly Phe Lys Trp Glu Arg Val Met Asn Phe Glu 
85 90 95 100 

gac ggt ggc gtg gtt act gta tcc caa gat tcc agt ttg aaa gac ggc 392 
Asp Gly Gly Val Val Thr Val Ser Gln Asp Ser Ser Leu Lys Asp Gly 
105 110 115 

tgt ttc atc tac gag gtc aag ttc att ggg gtg aac ttt cct tct gat 440 
Cys Phe Ile Tyr Glu Val Lys Phe Ile Gly Val Asn Phe Pro Ser Asp 
120 125 130 

gga cct gtt atg cag agg agg aca cgg ggc tgg gaa gcc agc tct gag 488 
Gly Pro Val Met Gln Arg Arg Thr Arg Gly Trp Glu Ala Ser Ser Glu 
135 140 145 

cgt ttg tat cct cgt gat ggg gtg ctg aaa gga gac atc cat atg gct 536 
Arg Leu Tyr Pro Arg Asp Gly Val Leu Lys Gly Asp Ile His Met Ala 
150 155 160 

ctg agg ctg gaa gga ggc ggc cat tac ctc gtt gaa ttc aaa agt att 584 
Leu Arg Leu Glu Gly Gly Gly His Tyr Leu Val Glu Phe Lys Ser Ile 
165 170 175 180 

tac atg gta aag aag cct tca gtg cag ttg cca ggc tac tat tat gtt 632 
Tyr Met Val Lys Lys Pro Ser Val Gln Leu Pro Gly Tyr Tyr Tyr Val 
185 190 195 

gac tcc aaa ctg gat atg acg agc cac aac gaa gat tac aca gtc gtt 680 
Asp Ser Lys Leu Asp Met Thr Ser His Asn Glu Asp Tyr Thr Val Val 
200 205 210 

gag cag tat gaa aaa acc cag gga cgc cac cat ccg ttc att aag cct 728 
Glu Gln Tyr Glu Lys Thr Gln Gly Arg His His Pro Phe Ile Lys Pro 
215 220 225 

ctg cag tga actcggctca gtcatggatt agcggtaatg gccacaaaag 777 
Leu Gln * 
230 

gcacgatgat cgttttttag gaatgcagcc aaaaattgaa ggttatgaca gtagaaatac 837 

aagcaacagg ctttgcttat taaacatgta attgaaaac 876 

<210> 40 
<211> 230 
<212> PRT 

<213> Discosoma species 
<400> 40 

Met Ser Cys Ser Lys Asn Val Ile Lys Glu Phe Met Arg Phe Lys Val 
1 5 10 15 

Arg Met Glu Gly Thr Val Asn Gly His Glu Phe Glu Ile Lys Gly Glu 
20 25 30 

Gly Glu Gly Arg Pro Tyr Glu Gly His Cys Ser Val Lys Leu Met Val 
35 40 45 

Thr Lys Gly Gly Pro Leu Pro Phe Ala Phe Asp Ile Leu Ser Pro Gln 
50 55 60 

Phe Gln Tyr Gly Ser Lys Val Tyr Val Lys His Pro Ala Asp Ile Pro 
65 70 75 80 

Asp Tyr Lys Lys Leu Ser Phe Pro Glu Gly Phe Lys Trp Glu Arg Val 
85 90 95 

Met Asn Phe Glu Asp Gly Gly Val Val Thr Val Ser Gln Asp Ser Ser 
100 105 110 

Leu Lys Asp Gly Cys Phe Ile Tyr Glu Val Lys Phe Ile Gly Val Asn 
115 120 125 

Phe Pro Ser Asp Gly Pro Val Met Gln Arg Arg Thr Arg Gly Trp Glu 
130 135 140 

Ala Ser Ser Glu Arg Leu Tyr Pro Arg Asp Gly Val Leu Lys Gly Asp 
145 150 155 160 

Ile His Met Ala Leu Arg Leu Glu Gly Gly Gly His Tyr Leu Val Glu 
165 170 175 

Phe Lys Ser Ile Tyr Met Val Lys Lys Pro Ser Val Gln Leu Pro Gly 
180 185 190 

Tyr Tyr Tyr Val Asp Ser Lys Leu Asp Met Thr Ser His Asn Glu Asp 
195 200 205 

Tyr Thr Val Val Glu Gln Tyr Glu Lys Thr Gln Gly Arg His His Pro 
210 215 220 

Phe Ile Lys Pro Leu Gln 
225 230 



<210> 41 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-att 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 41 

rkycwgcttt yktrtacnaa stsgb 25 

<210> 42 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attB 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 42 

agccwgcttt yktrtacnaa ctsgb 25 

<210> 43 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attR 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 43 

gttcagcttt cktrtacnaa ctsgb 25 

<210> 44 

<211> 25 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> m-attL 

<221> misc_difference 

<222> 18 

<223> n is a or g or c or t/u 

<400> 44 

agccwgcttt cktrtacnaa gtsgb 25 

<210> 45 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attP1 

<221> misc_difference 

<222> 18 

<223> n is a or g or c or t/u 

<400> 45 

gttcagcttt yktrtacnaa gtsgb 25 
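
SEQ ID NOs: 41-45 are written with IUPAC ambiguity codes (r, k, y, w, s, b and n) rather than single bases. The Python sketch below is an illustration only and not part of the listing. The helper names `count_sequences` and `is_covered_by` are hypothetical; the sketch counts how many concrete sequences a degenerate site stands for and checks whether each of SEQ ID NOs: 42-45 is a special case of the general pattern of SEQ ID NO: 41.

```python
from math import prod

# Standard IUPAC nucleotide codes; u is treated as t in this sketch.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "k": "gt", "m": "ac",
    "s": "cg", "w": "at", "b": "cgt", "d": "agt",
    "h": "act", "v": "acg", "n": "acgt",
}

M_ATT = "rkycwgctttyktrtacnaastsgb"  # SEQ ID NO: 41, written without spaces

# Keyed by SEQ ID NO so the sketch does not depend on the site names.
DERIVED_SITES = {
    42: "agccwgctttyktrtacnaactsgb",
    43: "gttcagctttcktrtacnaactsgb",
    44: "agccwgctttcktrtacnaagtsgb",
    45: "gttcagctttyktrtacnaagtsgb",
}


def count_sequences(pattern: str) -> int:
    """Number of concrete base sequences described by a degenerate pattern."""
    return prod(len(IUPAC[code]) for code in pattern)


def is_covered_by(specific: str, general: str) -> bool:
    """True if every position of `specific` allows only bases `general` allows."""
    if len(specific) != len(general):
        return False
    return all(set(IUPAC[s]) <= set(IUPAC[g]) for s, g in zip(specific, general))


for seq_id, site in DERIVED_SITES.items():
    print(seq_id, count_sequences(site), is_covered_by(site, M_ATT))
```

Each printed line gives the SEQ ID NO, the number of concrete 25-mers the degenerate site stands for, and whether the site falls within the SEQ ID NO: 41 pattern.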

<210> 46 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB1 

<400> 46 

agcctgcttt tttgtacaaa cttgt 25 

<210> 47 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB2 

<400> 47 

agcctgcttt cttgtacaaa cttgt 25 

<210> 48 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB3 

<400> 48 

acccagcttt cttgtacaaa cttgt 25 



<210> 49 



<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR1 
<400> 49 

gttcagcttt tttgtacaaa cttgt 25 

<210> 50 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR2 
<400> 50 

gttcagcttt cttgtacaaa cttgt 25 

<210> 51 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR3 
<400> 51 

gttcagcttt cttgtacaaa gttgg 25 

<210> 52 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL1 
<400> 52 

agcctgcttt tttgtacaaa gttgg 25 

<210> 53 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL2 
<400> 53 

agcctgcttt cttgtacaaa gttgg 25 

<210> 54 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL3 
<400> 54 

acccagcttt cttgtacaaa gttgg 25 



<210> 55 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP1 
<400> 55 

gttcagcttt tttgtacaaa gttgg 25 

<210> 56 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP2,P3 
<400> 56 

gttcagcttt cttgtacaaa gttgg 25 

<210> 57 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> LOX P site 
<400> 57 

ataacttcgt ataatgtatg ctatacgaag ttat 34 

<210> 58 

<211> 1032 

<212> DNA 

<213> Escherichia coli 

<220> 
<221> CDS 

<222> (1)...(1032) 

<223> nucleotide sequence encoding Cre recombinase 
<400> 58 

atg tcc aat tta ctg acc gta cac caa aat ttg cct gca tta ccg gtc 48 
Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 
1 5 10 15 

gat gca acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc agg 96 
Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 
20 25 30 

gat cgc cag gcg ttt tct gag cat acc tgg aaa atg ctt ctg tcc gtt 144 
Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 
35 40 45 

tgc cgg tcg tgg gcg gca tgg tgc aag ttg aat aac cgg aaa tgg ttt 192 
Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 
50 55 60 

ccc gca gaa cct gaa gat gtt cgc gat tat ctt cta tat ctt cag gcg 240 
Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 
65 70 75 80 

cgc ggt ctg gca gta aaa act atc cag caa cat ttg ggc cag cta aac 288 
Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 
85 90 95 

atg ctt cat cgt cgg tcc ggg ctg cca cga cca agt gac agc aat gct 336 
Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 
100 105 110 

gtt tca ctg gtt atg cgg cgg atc cga aaa gaa aac gtt gat gcc ggt 384 
Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 
115 120 125 

gaa cgt gca aaa cag gct cta gcg ttc gaa cgc act gat ttc gac cag 432 
Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 
130 135 140 

gtt cgt tca ctc atg gaa aat agc gat cgc tgc cag gat ata cgt aat 480 
Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 
145 150 155 160 

ctg gca ttt ctg ggg att gct tat aac acc ctg tta cgt ata gcc gaa 528 
Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 
165 170 175 

att gcc agg atc agg gtt aaa gat atc tca cgt act gac ggt ggg aga 576 
Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 
180 185 190 

atg tta atc cat att ggc aga acg aaa acg ctg gtt agc acc gca ggt 624 
Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 
195 200 205 

gta gag aag gca ctt agc ctg ggg gta act aaa ctg gtc gag cga tgg 672 
Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 
210 215 220 

att tcc gtc tct ggt gta gct gat gat ccg aat aac tac ctg ttt tgc 720 
Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 
225 230 235 240 

cgg gtc aga aaa aat ggt gtt gcc gcg cca tct gcc acc agc cag cta 768 
Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 
245 250 255 

tca act cgc gcc ctg gaa ggg att ttt gaa gca act cat cga ttg att 816 
Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 
260 265 270 

tac ggc gct aag gat gac tct ggt cag aga tac ctg gcc tgg tct gga 864 
Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 
275 280 285 

cac agt gcc cgt gtc gga gcc gcg cga gat atg gcc cgc gct gga gtt 912 
His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 
290 295 300 

tca ata ccg gag atc atg caa gct ggt ggc tgg acc aat gta aat att 960 
Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 
305 310 315 320 

gtc atg aac tat atc cgt aac ctg gat agt gaa aca ggg gca atg gtg 1008 
Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 
325 330 335 

cgc ctg ctg gaa gat ggc gat tag 1032 
Arg Leu Leu Glu Asp Gly Asp * 
340 



<210> 59 
<211> 343 
<212> PRT 
<213> Escherichia coli 



<400> 59 

Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 
1 5 10 15 

Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 
20 25 30 

Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 
35 40 45 

Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 
50 55 60 

Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 
65 70 75 80 

Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 
85 90 95 

Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 
100 105 110 

Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 
115 120 125 

Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 
130 135 140 

Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 
145 150 155 160 

Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 
165 170 175 

Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 
180 185 190 

Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 
195 200 205 

Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 
210 215 220 

Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 
225 230 235 240 

Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 
245 250 255 

Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 
260 265 270 

Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 
275 280 285 

His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 
290 295 300 

Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 
305 310 315 320 

Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 
325 330 335 

Arg Leu Leu Glu Asp Gly Asp 
340 
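
Each coding sequence in this listing is paired with a separate protein entry (SEQ ID NOs: 37/38, 39/40 and 58/59), and the <211> length of the protein follows from the <222> coordinates of the CDS. The Python sketch below is an illustration only and not part of the listing. The helper names `protein_length` and `translate` are hypothetical; the sketch assumes the standard genetic code and a CDS whose last codon is a stop codon.

```python
from itertools import product

# Standard genetic code in the conventional t, c, a, g ordering.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def protein_length(cds_start: int, cds_end: int) -> int:
    """Residues encoded by a CDS given 1-based inclusive coordinates.

    Assumes the final codon of the CDS is a stop codon.
    """
    n_bases = cds_end - cds_start + 1
    if n_bases % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return n_bases // 3 - 1


def translate(dna: str) -> str:
    """Translate DNA (spaces allowed) into one-letter amino acid codes."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i : i + 3]] for i in range(0, usable, 3))


# <222> coordinates as given for SEQ ID NOs: 37, 39 and 58.
for start, end in [(1, 1071), (45, 737), (1, 1032)]:
    print(start, end, protein_length(start, end))

# Nucleotides 49-96 of SEQ ID NO: 58 (second row of the Cre CDS).
print(translate("gat gca acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc agg"))
```

The three computed lengths can be checked against the <211> values of SEQ ID NOs: 38, 40 and 59, and the translated fragment against residues 17-32 of SEQ ID NO: 59.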

<210> 60 
<211> 1272 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 

<221> CDS 

<222> (1)...(1272) 

<223> nucleotide sequence encoding Flip recombinase 
<400> 60 

atg cca caa ttt ggt ata tta tgt aaa aca cca cct aag gtg ctt gtt 
Met Pro Gin Phe Gly He Leu Cys Lys Thr Pro Pro Lys Val Leu Val 
1 5 10 15 

cgt cag ttt gtg gaa agg ttt gaa aga cct tea ggt gag aaa ata gca 
Arg Gin Phe Val Glu Arg Phe Glu Arg Pro Ser Gly Glu Lys He Ala 
20 25 30 



48 



96 




tta tgt get get gaa eta ace tat tta tgt tgg atg att aca cat aac 14 4 

Leu Cys Ala Ala Glu Leu Thr Tyr Leu Cys Trp Met lie Thr His Asn 
35 40 45 



ata 192 

Asn Thr lie 

50 " " " 55 60 



qqa aca gca ate aag aga gee aca ttc atg age tat aat act ate 
Gly Thr Ala lie Lys Arg Ala Thr Phe Met Ser Tyr Asn Thr lie IXe 



age aat teg ctg agt ttc gat att gtc aat aaa tea etc cag ttt aaa 
Ser Asn Ser Leu Ser Phe Asp lie Val Asn Lys Ser Leu Gin Phe Lys 



65 70 



75 80 



tac aag acg caa aaa gca aca att ctg gaa gec tea tta aag aaa ttg 
Tyr Lys Thr Gin Lys Ala Thr lie Leu Glu Ala Ser Leu Lys Lys Leu 
85 90 95 

att cct get tgg gaa ttt aca att att cct tac tat gga caa aaa cat 
lie Pro Ala Trp Glu Phe Thr lie lie Pro Tyr Tyr Gly Gin Lys His 



100 105 110 

caa tct gat ate act gat att gta agt agt ttg caa tta cag ttc gaa 

Gin Ser Asp lie Thr Asp He Val Ser Ser Leu Gin Leu Gin Phe Glu 
115 12 0 125 



aag tat ctg gga gta ata ate cag tgt tta gtg aca gag aca aag 
Lys Tyr Leu Gly Val He He Gin Cys Leu Val Thr Glu Thr Lys Thr 
2io 215 220 

aqc gtt agt agg cac ata tac ttc ttt age gca agg ggt agg ate gat 
Ser Val Ser Arg His He Tyr Phe Phe Ser Ala Arg Gly Arg He Asp 
225 230 235 240 

cca ctt gta tat ttg gat gaa ttt ttg agg aat tct gaa cca gtc eta 
Pro Leu Val Tyr Leu Asp Glu Phe Leu Arg Asn Ser Glu Pro Val Leu 
245 250 255 

aaa cga gta aat agg acc ggc aat tct tea age aat aaa cag gaa tac 
Lys Arg Val Asn Arg Thr Gly Asn Ser Ser Ser Asn Lys Gin Glu Tyr 
260 265 270 

caa tta tta aaa gat aac tta gtc aga teg tac aat aaa get ttg aag 
Gin Leu Leu Lys Asp Asn Leu Val Arg Ser Tyr Asn Lys Ala Leu Lys 
275 280 285 

aaa aat qcg cct tat tea ate ttt get ata aaa aat ggc cca aaa tct 
Lys Asn Ala Pro Tyr Ser He Phe Ala He Lys Asn Gly Pro Lys Ser 
J 295 300 



240 



288 



336 



3 84 



432 



tea teg gaa gaa gca gat aag gga aat age cac agt aaa aaa atg ctt 

Ser Ser Glu Glu Ala Asp Lys Gly Asn Ser His Ser Lys Lys Met Leu 
130 135 140 

aaa gca ctt eta agt gag ggt gaa age ate tgg gag ate act gag aaa 480 

Lys Ala Leu Leu Ser Glu Gly Glu Ser He Trp Glu He Thr Glu Lys 

145 150 155 160 

ata eta aat teg ttt gag tat act teg aga ttt aca aaa aca aaa act 

He Leu Asn Ser Phe Glu Tyr Thr Ser Arg Phe Thr Lys Thr Lys Thr 
165 170 175 

tta tac caa ttc etc ttc eta get act ttc ate aat tgt gga aga ttc 

Leu Tyr Gin Phe Leu Phe Leu Ala Thr Phe He Asn Cys Gly Arg Phe 
180 185 190 

age gat att aag' aac gtt gat ccg aaa tea ttt aaa tta gtc caa aat 
Ser Asp He Lys Asn Val Asp Pro Lys Ser Phe Lys Leu Val Gin Asn 
195 200 205 



528 



576 



624 



672 



720 



768 



816 



864 



912 



290 



cac att gga aga cat 
His He Gly Arg His 
305 

acg gag ttg act aat 
Thr Glu Leu Thr Asn 
325 



gcc gtg gcc agg aca 
Ala Val Ala Arg Thr 
340 

cac tac ttc gca eta 
His Tyr Phe Ala Leu 
355 

aag gaa atg ata gca 
Lys Glu Met He Ala 
370 

cag cat ata gaa cag 
Gin His lie Glu Gin 
385 

ccc gca tgg aat ggg 
Pro Ala Trp Asn Gly 
405 

tec tac ata aat aga 
Ser Tyr lie Asn Arg 
420 



<210> 61 

<211> 422 

<212> PRT 

<213> Saccharomyces cerevisiae 



<400> 61 



Pro 


Gin 


Phe 


Gly 


lie 


1 








5 


Gin 


Phe 


Val 


Glu 


Arg 








20 




Cys 


Ala 


Ala 


Glu 


Leu 




35 






Thr 


Ala 


lie 


Lys 


Arg 




50 








Aan 


Ser 


Leu 


Ser 


Phe 


65 










Lys 


Thr 


Gin 


Lys 


Ala 










85 


Pro 


Ala 


Trp 


Glu 


Phe 








100 




Ser 


Asp 


lie 


Thr 


Asp 






115 






Ser 


Glu 


Glu 


Ala 


Asp 




13 0 






Ala 


Leu 


Leu 


Ser 


Glu 


145 










Leu 


Asn 


Ser 


Phe 


Glu 










165 


Tyr 


Gin 


Phe 


Leu 


Phe 








180 




Asp 


lie 


Lys 


Asn 


Val 






195 






Tyr 


Leu 


Gly 


Val 


lie 
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ttg 


atg 


acc 


tea 


ttt 


ctt 


tea 


atg 


aag 


ggc 


eta 


Leu 


Met 


Thr 


Ser 


Phe 


Leu 


Ser 


Met 


Lvs 


Gly 


Leu 


310 










315 










320 


gtt 


gtg 


gga 


aat 


tgg 


age 


gat 


aag 


cgt 


get 


tct 


V ai 


V ct-L 


J-y 


Asn 


Trp 


Ser 


Asp 


Lys 




Ala 


Ser 










330 










335 




acg 


tat 


act 


cat 


cag 


ata 


aca 


gca 


ata 


cct 


gat 


*"PVi v 
JL 111. 


Tyr 


Tin- 


His 


Gin 


He 


Thr 


Ala 


He 


Pro 


Asp 






345 










350 






gtt 


tct 


egg 


tac 


tat 


gca 


tat 


gat 


cca 


ata 


tea 


Val 


Ser 


Arg 


Tyr 


Tyr 


Ala 


Tyr 


Asp 


Pro 


He 


Ser 






360 










365 








ttg 


aag 


gat 


gag 


act 


aat 


cca 


att 


gag 


gag 


tgg 


Leu 


Lys 


Asp 


Glu 


Thr 


Asn 


Pro 


He 


Glu 


Glu 


Trp 




375 








380 










eta 


aag 


got 


agt 


get 


gaa 


gga 


age 


ata 


cga 


tac 


Leu 


Lys 


Gly 


Ser 


Ala 


Glu 


Gly 


Ser 


He 


Arg 


Tyr 


390 








395 










400 


ata 


ata 


tea 


cag 


gag 


gta 


eta 


gac 


tac 


ctt 


tea 


He 


He 


Ser 


Gin 


Glu 


Val 


Leu 


Asp 


Tyr 


Leu 


Ser 










410 










415 




cgc 


ata 


taa 


















Arg 


He 


* 


















cerevisiae 


















Leu 


Cys 


Lys 


Thr 


Pro 


Pro 


Lys 


Val 


Leu 


Val 


Arg 








10 










15 




Phe 


Glu 


Arg 


Pro 


Ser 


Gly 


Glu 


Lys 


Ile


Ala 


Leu 






25 










30 






Thr 


Tyr 


Leu 


Cys 


Trp 


Met 


Ile


Thr 


His 


Asn 


Gly 






40 










45 








Ala 


Thr 


Phe 


Met 


Ser 


Tyr 


Asn 


Thr 


Ile


Ile


Ser 




55 










60 










Asp 


Ile


Val 


Asn 


Lys 


Ser 


Leu 


Gln


Phe 


Lys 


Tyr 


70 










75 










80 


Thr 


Ile


Leu 


Glu 


Ala 


Ser 


Leu 


Lys 


Lys 


Leu 


Ile










90 








95 




Thr 


Ile


Ile


Pro 


Tyr 


Tyr 


Gly 


Gln


Lys


His


Gln








105 






110 






Ile


Val 


Ser 


Ser 


Leu 


Gln


Leu


Gln


Phe 


Glu 


Ser 






120 










125 








Lys 


Gly 


Asn 


Ser 


His 


Ser 


Lys 


Lys 


Met 


Leu 


Lys 




135 










140 










Gly 


Glu 


Ser 


Ile


Trp


Glu


Ile


Thr 


Glu 


Lys 


Ile


150 










155 










160 


Tyr 


Thr 


Ser 


Arg 


Phe 


Thr 


Lys 


Thr 


Lys 


Thr 


Leu 








170 










175 




Leu


Ala 


Thr 


Phe 


Ile


Asn 


Cys 


Gly 


Arg 


Phe 


Ser 








185 








190 






Asp 


Pro 


Lys 


Ser 


Phe 


Lys 


Leu 


Val 


Gln


Asn 


Lys 




200 










205 








Ile


Gln


Cys 


Leu 


Val 


Thr 


Glu 


Thr 


Lys 


Thr 


Ser
210










215 










220 










Val 


Ser


Arg


His


Ile


Tyr 


Phe 


Phe 


Ser 


Ala 


Arg 


Gly 


Arg 


Ile


Asp 


Pro 


225 








230 










235 










240 


Leu 


Val 


Tyr 


Leu 


Asp 


Glu 


Phe 


Leu 


Arg 


Asn 


Ser 


Glu 


Pro 


Val 


Leu 


Lys 








245 










250 










255 




Arg 


Val 


Asn


Arg 


Thr 


Gly 


Asn 


Ser 


Ser 


Ser 


Asn 


Lys 


Gln


Glu


Tyr


Gln






260 








265 










270 






Leu 


Leu 


Lys 


Asp 


Asn 


Leu 


Val 


Arg 


Ser 


Tyr 


Asn 


Lys 


Ala 


Leu 


Lys 


Lys 






275 








280 










285 






His 


Asn 


Ala 


Pro 


Tyr 


Ser 


Ile


Phe


Ala


Ile


Lys 


Asn 


Gly 


Pro 


Lys 


Ser 




290 








295 










300 










Ile


Gly 


Arg 


His 


Leu 


Met 


Thr 


Ser 


Phe 


Leu 


Ser 


Met 


Lys 


Gly 


Leu 


Thr 


305 






310 










315 










320 


Glu 


Leu 


Thr 


Asn 


Val 


Val 


Gly 


Asn 


Trp 


Ser 


Asp 


Lys 


Arg 


Ala 


Ser 


Ala 










325 








330 










335 




Val 


Ala 


Arg 


Thr 


Thr 


Tyr 


Thr 


His 


Gln


Ile


Thr


Ala


Ile


Pro 


Asp 


His 






340 










345 










350 






Tyr 


Phe 


Ala 


Leu 


Val 


Ser 


Arg 


Tyr 


Tyr 


Ala 


Tyr 


Asp 


Pro 


Ile


Ser 


Lys 




355 










360 










365 








Glu 


Met 
370 


Ile


Ala 


Leu 


Lys 


Asp 
375 


Glu 


Thr 


Asn 


Pro 


Ile
380 


Glu 


Glu 


Trp 


Gln


His


Ile


Glu


Gln


Leu 


Lys 


Gly 


Ser 


Ala 


Glu 


Gly 


Ser 


Ile


Arg 


Tyr 


Pro 


385 










390 








395 










400 


Ala 


Trp 


Asn 


Gly 


Ile


Ile


Ser


Gln


Glu 


Val 


Leu 


Asp 


Tyr 


Leu 


Ser 


Ser 






405 










410 










415 




Tyr 


Ile


Asn 


Arg 
420 


Arg 


Ile























<210> 62 
<211> 48 
<212> DNA 

<213> Artificial Sequence 

<220> 
<223> IR2 

<400> 62 

gaagttccta ttccgaagtt cctattctct agaaagtata ggaacttc 48

<210> 63 

<211> 48 

<212> DNA 

<213> Artificial Sequence 

<220> 
<223> IR1 

<400> 63 

gaagttccta tactttctag agaataggaa cttcggaata ggaacttc 48

<210> 64 
<211> 66 
<212> DNA 

<213> Bacteriophage mu 

<220> 

<221> CDS 

<222> (1) . . . (66) 

<223> nucleotide sequence encoding GIN recombinase
<400> 64 

tca act ctg tat aaa aaa cac ccc gcg aaa cga gcg cat ata gaa aac 48
Ser Thr Leu Tyr Lys Lys His Pro Ala Lys Arg Ala His Ile Glu Asn
1 5 10 15

gac gat cga atc aat taa 66
Asp Asp Arg Ile Asn *
20

<210> 65 
<211> 21 
<212> PRT 

<213> bacteriophage mu 
<400> 65 

Ser Thr Leu Tyr Lys Lys His Pro Ala Lys Arg Ala His Ile Glu Asn
1 5 10 15

Asp Asp Arg Ile Asn
20

<210> 66 
<211> 69 
<212> DNA 

<213> Bacteriophage mu 

<220> 

<221> CDS 

<222> (1) . . . (69) 

<223> nucleotide sequence encoding Gin recombinase 
<400> 66 

tat aaa aaa cat ccc gcg aaa cga acg cat ata gaa aac gac gat cga 48
Tyr Lys Lys His Pro Ala Lys Arg Thr His Ile Glu Asn Asp Asp Arg
1 5 10 15

atc aat caa atc gat cgg taa 69
Ile Asn Gln Ile Asp Arg *
20

<210> 67 
<211> 22 
<212> PRT 

<213> bacteriophage mu 
<220> 

<223> Gin recombinase of bacteriophage mu 
<400> 67 

Tyr Lys Lys His Pro Ala Lys Arg Thr His Ile Glu Asn Asp Asp Arg
1 5 10 15

Ile Asn Gln Ile Asp Arg
20

<210> 68 
<211> 555 
<212> DNA 

<213> Escherichia coli 

<220> 

<221> CDS 

<222> (1) . . . (555) 

<223> nucleotide sequence encoding PIN recombinase
<400> 68 

atg ctt att ggc tat gta cgc gta tca aca aat gac cag aac aca gat 48
Met Leu Ile Gly Tyr Val Arg Val Ser Thr Asn Asp Gln Asn Thr Asp
1 5 10 15

cta caa cgt aat gcg ctg aac tgt gca gga tgc gag ctg att ttt gaa 96
Leu Gln Arg Asn Ala Leu Asn Cys Ala Gly Cys Glu Leu Ile Phe Glu
20 25 30

gac aag ata agc ggc aca aag tcc gaa agg ccg gga ctg aaa aaa ctg 144
Asp Lys Ile Ser Gly Thr Lys Ser Glu Arg Pro Gly Leu Lys Lys Leu
35 40 45

ctc agg aca tta tcg gca ggt gac act ctg gtt gtc tgg aag ctg gat 192
Leu Arg Thr Leu Ser Ala Gly Asp Thr Leu Val Val Trp Lys Leu Asp
50 55 60

cgg ctg ggg cgt agt atg cgg cat ctt gtc gtg ctg gtg gag gag ttg 240
Arg Leu Gly Arg Ser Met Arg His Leu Val Val Leu Val Glu Glu Leu
65 70 75 80

cgc gaa cga ggc atc aac ttt cgt agt ctg acg gat tca att gat acc 288
Arg Glu Arg Gly Ile Asn Phe Arg Ser Leu Thr Asp Ser Ile Asp Thr
85 90 95

agc aca cca atg gga cgc ttt ttc ttt cat gtg atg ggt gcc ctg gct 336
Ser Thr Pro Met Gly Arg Phe Phe Phe His Val Met Gly Ala Leu Ala
100 105 110

gaa atg gag cgt gaa ctg att gtt gaa cga aca aaa gct gga ctg gaa 384
Glu Met Glu Arg Glu Leu Ile Val Glu Arg Thr Lys Ala Gly Leu Glu
115 120 125

act gct cgt gca cag gga cga att ggt gga cgt cgt ccc aaa ctt aca 432
Thr Ala Arg Ala Gln Gly Arg Ile Gly Gly Arg Arg Pro Lys Leu Thr
130 135 140

cca gaa caa tgg gca caa gct gga cga tta att gca gca gga act cct 480
Pro Glu Gln Trp Ala Gln Ala Gly Arg Leu Ile Ala Ala Gly Thr Pro
145 150 155 160

cgc cag aag gtg gcg att atc tat gat gtt ggt gtg tca act ttg tat 528
Arg Gln Lys Val Ala Ile Ile Tyr Asp Val Gly Val Ser Thr Leu Tyr
165 170 175

aag agg ttt cct gca ggg gat aaa taa 555
Lys Arg Phe Pro Ala Gly Asp Lys *
180

<210> 69 
<211> 184 
<212> PRT 

<213> Escherichia coli 
<400> 69 

Met Leu Ile Gly Tyr Val Arg Val Ser Thr Asn Asp Gln Asn Thr Asp
1 5 10 15

Leu Gln Arg Asn Ala Leu Asn Cys Ala Gly Cys Glu Leu Ile Phe Glu
20 25 30

Asp Lys Ile Ser Gly Thr Lys Ser Glu Arg Pro Gly Leu Lys Lys Leu
35 40 45

Leu Arg Thr Leu Ser Ala Gly Asp Thr Leu Val Val Trp Lys Leu Asp
50 55 60

Arg Leu Gly Arg Ser Met Arg His Leu Val Val Leu Val Glu Glu Leu
65 70 75 80

Arg Glu Arg Gly Ile Asn Phe Arg Ser Leu Thr Asp Ser Ile Asp Thr
85 90 95

Ser Thr Pro Met Gly Arg Phe Phe Phe His Val Met Gly Ala Leu Ala
100 105 110

Glu Met Glu Arg Glu Leu Ile Val Glu Arg Thr Lys Ala Gly Leu Glu
115 120 125

Thr Ala Arg Ala Gln Gly Arg Ile Gly Gly Arg Arg Pro Lys Leu Thr
130 135 140

Pro Glu Gln Trp Ala Gln Ala Gly Arg Leu Ile Ala Ala Gly Thr Pro
145 150 155 160

Arg Gln Lys Val Ala Ile Ile Tyr Asp Val Gly Val Ser Thr Leu Tyr
165 170 175

Lys Arg Phe Pro Ala Gly Asp Lys
180

<210> 70 
<211> 4778 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> pCX plasmid



<400> 70 

gtcgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 60
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180
ggactttcca ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac 240
atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300
cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360
tattagtcat cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc 420
atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480
gcgatggggg cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540
gggcggggcg aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600
tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660
gggagtcgct gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720
ccggctctga ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780
gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840
ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900
tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960
cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020
ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140
cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200
gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260
ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440
tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560
acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620
gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680
acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcactcct caggtgcagg 1740
ctgcctatca gaaggtggtg gctggtgtgg ccaatgccct ggctcacaaa taccactgag 1800
atctttttcc ctctgccaaa aattatgggg acatcatgaa gccccttgag catctgactt 1860
ctggctaata aaggaaattt attttcattg caatagtgtg ttggaatttt ttgtgtctct 1920
cactcggaag gacatatggg agggcaaatc atttaaaaca tcagaatgag tatttggttt 1980
agagtttggc aacatatgcc atatgctggc tgccatgaac aaaggtggct ataaagaggt 2040
catcagtata tgaaacagcc ccctgctgtc cattccttat tccatagaaa agccttgact 2100
tgaggttaga ttttttttat attttgtttt gtgttatttt tttctttaac atccctaaaa 2160
ttttccttac atgttttact agccagattt ttcctcctct cctgactact cccagtcata 2220
gctgtccctc ttctcttatg aagatccctc gacctgcagc ccaagcttgg cgtaatcatg 2280
gtcatagctg tttcctgtgt gaaattgtta tccgctcaca attccacaca acatacgagc 2340
cggaagcata aagtgtaaag cctggggtgc ctaatgagtg agctaactca cattaattgc 2400
gttgcgctca ctgcccgctt tccagtcggg aaacctgtcg tgccagcgga tccgcatctc 2460
aattagtcag caaccatagt cccgccccta actccgccca tcccgcccct aactccgccc 2520
agttccgccc attctccgcc ccatggctga ctaatttttt ttatttatgc agaggccgag 2580
gccgcctcgg cctctgagct attccagaag tagtgaggag gcttttttgg aggcctaggc 2640
ttttgcaaaa agctaacttg tttattgcag cttataatgg ttacaaataa agcaatagca 2700
tcacaaattt cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac 2760
tcatcaatgt atcttatcat gtctggatcc gctgcattaa tgaatcggcc aacgcgcggg 2820
gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc 2880
ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac 2940
agaatcaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa 3000
ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca 3060
caaaaatcga cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc 3120
gtttccccct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata 3180
cctgtccgcc tttctccctt cgggaagcgt ggcgctttct caatgctcac gctgtaggta 3240
tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca 3300
gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 3360
cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 3420
tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg 3480
tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg 3540
caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 3600
aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa 3660
cgaaaactca cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat 3720
ccttttaaat taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc 3780
tgacagttac caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc 3840
atccatagtt gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc 3900
tggccccagt gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc 3960
aataaaccag ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc 4020
catccagtct attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt 4080
gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc 4140
ttcattcagc tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa 4200
aaaagcggtt agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt 4260
atcactcatg gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg 4320
cttttctgtg actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc 4380
gagttgctct tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa 4440
agtgctcatc attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt 4500
gagatccagt tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt 4560
caccagcgtt tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag 4620
ggcgacacgg aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta 4680
tcagggttat tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat 4740
aggggttccg cgcacatttc cccgaaaagt gccacctg 4778



<210> 71 

<211> 5510 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCXeGFP plasmid 



<400> 71 

gtcgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 60
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180
ggactttcca ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac 240
atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300
cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360
tattagtcat cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc 420
atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480
gcgatggggg cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540
gggcggggcg aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600
tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660
gggagtcgct gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720
ccggctctga ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780
gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840
ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900
tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960
cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020
ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140
cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200
gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260
ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440
tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560
acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620
gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680
acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcgccacc atggtgagca 1740
agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac ggcgacgtaa 1800
acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac ggcaagctga 1860
ccctgaagtt catctgcacc accggcaagc tgcccgtgcc ctggcccacc ctcgtgacca 1920
ccctgaccta cggcgtgcag tgcttcagcc gctaccccga ccacatgaag cagcacgact 1980
tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc ttcaaggacg 2040
acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg gtgaaccgca 2100
tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac aagctggagt 2160
acaactacaa cagccacaac gtctatatca tggccgacaa gcagaagaac ggcatcaagg 2220
tgaacttcaa gatccgccac aacatcgagg acggcagcgt gcagctcgcc gaccactacc 2280
agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac tacctgagca 2340
cccagtccgc cctgagcaaa gaccccaacg agaagcgcga tcacatggtc ctgctggagt 2400
tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaagtaa gaattcactc 2460
ctcaggtgca ggctgcctat cagaaggtgg tggctggtgt ggccaatgcc ctggctcaca 2520
aataccactg agatcttttt ccctctgcca aaaattatgg ggacatcatg aagccccttg 2580
agcatctgac ttctggctaa taaaggaaat ttattttcat tgcaatagtg tgttggaatt 2640
ttttgtgtct ctcactcgga aggacatatg ggagggcaaa tcatttaaaa catcagaatg 2700
agtatttggt ttagagtttg gcaacatatg ccatatgctg gctgccatga acaaaggtgg 2760
ctataaagag gtcatcagta tatgaaacag ccccctgctg tccattcctt attccataga 2820
aaagccttga cttgaggtta gatttttttt atattttgtt ttgtgttatt tttttcttta 2880
acatccctaa aattttcctt acatgtttta ctagccagat ttttcctcct ctcctgacta 2940
ctcccagtca tagctgtccc tcttctctta tgaagatccc tcgacctgca gcccaagctt 3000
ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca 3060
caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact 3120
cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagcg 3180
gatccgcatc tcaattagtc agcaaccata gtcccgcccc taactccgcc catcccgccc 3240
ctaactccgc ccagttccgc ccattctccg ccccatggct gactaatttt ttttatttat 3300
gcagaggccg aggccgcctc ggcctctgag ctattccaga agtagtgagg aggctttttt 3360
ggaggcctag gcttttgcaa aaagctaact tgtttattgc agcttataat ggttacaaat 3420
aaagcaatag catcacaaat ttcacaaata aagcattttt ttcactgcat tctagttgtg 3480
gtttgtccaa actcatcaat gtatcttatc atgtctggat ccgctgcatt aatgaatcgg 3540
ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 3600
ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 3660
acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 3720
aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 3780
tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 3840
aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 3900
gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcaatgctc 3960
acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 4020
accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 4080
ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 4140
gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 4200
gacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 4260
ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 4320
gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 4380
cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 4440
cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 4500
gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 4560
tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 4620
gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 4680
agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 4740
tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 4800
agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 4860
gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc 4920
catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 4980
ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc 5040
atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg 5100
tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 5160
cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 5220
cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 5280
atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 5340
aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 5400
ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 5460
aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg 5510



<210> 72 
<211> 282 
<212> DNA 

<213> Artificial Sequence 







<220> 

<223> attp 
<400> 72 

ccttgcgcta atgctctgtt acaggtcact aataccatct aagtagttga ttcatagtga 60
ctgcatatgt tgtgttttac agtattatgt agtctgtttt ttatgcaaaa tctaatttaa 120
tatattgata tttatatcat tttacgtttc tcgttcagct tttttatact aagttggcat 180
tataaaaaag cattgcttat caatttgttg caacgaacag gtcactatca gtcaaaataa 240
aatcattatt tgatttcaat tttgtcccac tccctgcctc tg 282

<210> 73
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 73

ggccccgtaa tgcagaagaa 20

<210> 74
<211> 32
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 74

ggtttaaagt gcgctcctcc aagaacgtca tc 32

<210> 75
<211> 40
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 75

agatctagag ccgccgctac aggaacaggt ggtggcggcc 40

<210> 76
<211> 37
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer 5PacSV40
<400> 76

ctgttaatta actgtggaat gtgtgtcagt tagggtg 37

<210> 77
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer Antisense Zeo
<400> 77

tgaacagggt cacgtcgtcc 20



<210> 78 
<211> 24
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer 5' HETS
<400> 78

gggccgaaac gatctcaacc tatt 24

<210> 79 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 3' HETS 
<400> 79 

cgcagcggcc ctcctactc 19

<210> 80 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 5 BSD 

<400> 80 

accatgaaaa catttaacat ttctcaaca 29

<210> 81 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer SV40polyA
<400> 81

tttatttgtg aaatttgtga tgctattgc 29

<210> 82 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 3BSP 

<400> 82 

ttaatttcgg gtatatttga gtgga 25



<210> 83 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer EP05XBA 
<400> 83 

tatctagaat gggggtgcac gaatgtcctg cc 32



<210> 84 
<211> 32 
<212> DNA 




<213> Artificial Sequence 
<220> 

<223> Primer EP03SBI 
<400> 84 

tacgtacgtc atctgtcccc tgtcctgcag gc 32 

<210> 85 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer GENEP03BSI
<400> 85 

cgtacgtcat ctgtcccctg tcctgca 27 

<210> 86 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer GENEP05XBA 
<400> 86 

tctagaatgg gggtgcacgg tgagtact 28

<210> 87 
<211> 4862 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pD2eGFP-lN plasmid from Clontech 
<400> 87 

tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg 60
cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt 120
gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca 180
atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc 240
aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta 300
catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac 360
catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg 420
atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg 480
ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt 540
acggtgggag gtctatataa gcagagctgg tttagtgaac cgtcagatcc gctagcgcta 600
ccggactcag atctcgagct caagcttcga attctgcagt cgacggtacc gcgggcccgg 660
gatccaccgg tcgccaccat ggtgagcaag ggcgaggagc tgttcaccgg ggtggtgccc 720
atcctggtcg agctggacgg cgacgtaaac ggccacaagt tcagcgtgtc cggcgagggc 780
gagggcgatg ccacctacgg caagctgacc ctgaagttca tctgcaccac cggcaagctg 840
cccgtgccct ggcccaccct cgtgaccacc ctgacctacg gcgtgcagtg cttcagccgc 900
taccccgacc acatgaagca gcacgacttc ttcaagtccg ccatgcccga aggctacgtc 960
caggagcgca ccatcttctt caaggacgac ggcaactaca agacccgcgc cgaggtgaag 1020
ttcgagggcg acaccctggt gaaccgcatc gagctgaagg gcatcgactt caaggaggac 1080
ggcaacatcc tggggcacaa gctggagtac aactacaaca gccacaacgt ctatatcatg 1140
gccgacaagc agaagaacgg catcaaggtg aacttcaaga tccgccacaa catcgaggac 1200
ggcagcgtgc agctcgccga ccactaccag cagaacaccc ccatcggcga cggccccgtg 1260
ctgctgcccg acaaccacta cctgagcacc cagtccgccc tgagcaaaga ccccaacgag 1320
aagcgcgatc acatggtcct gctggagttc gtgaccgccg ccgggatcac tctcggcatg 1380
gacgagctgt acaagaagct tagccatggc ttcccgccgg aggtggagga gcaggatgat 1440
ggcacgctgc ccatgtcttg tgcccaggag agcgggatgg accgtcaccc tgcagcctgt 1500
gcttctgcta ggatcaatgt gtagatgcgc ggccgcgact ctagatcata atcagccata 1560
ccacatttgt agaggtttta cttgctttaa aaaacctccc acacctcccc ctgaacctga 1620
aacataaaat gaatgcaatt gttgttgtta acttgtttat tgcagcttat aatggttaca 1680
aataaagcaa tagcatcaca aatttcacaa ataaagcatt tttttcactg cattctagtt 1740
gtggtttgtc caaactcatc aatgtatctt aaggcgtaaa ttgtaagcgt taatattttg 1800
ttaaaattcg cgttaaattt ttgttaaatc agctcatttt ttaaccaata ggccgaaatc 1860
ggcaaaatcc cttataaatc aaaagaatag accgagatag ggttgagtgt tgttccagtt 1920
tggaacaaga gtccactatt aaagaacgtg gactccaacg tcaaagggcg aaaaaccgtc 1980
tatcagggcg atggcccact acgtgaacca tcaccctaat caagtttttt ggggtcgagg 2040
tgccgtaaag cactaaatcg gaaccctaaa gggagccccc gatttagagc ttgacgggga 2100
aagccggcga acgtggcgag aaaggaaggg aagaaagcga aaggagcggg cgctagggcg 2160
ctggcaagtg tagcggtcac gctgcgcgta accaccacac ccgccgcgct taatgcgccg 2220
ctacagggcg cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta 2280
tttttctaaa tacattcaaa tatgtatccg ctcatgagac aataaccctg ataaatgctt 2340
caataatatt gaaaaaggaa gagtcctgag gcggaaagaa ccagctgtgg aatgtgtgtc 2400
agttagggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa agcatgcatc 2460
tcaattagtc agcaaccagg tgtggaaagt ccccaggctc cccagcaggc agaagtatgc 2520
aaagcatgca tctcaattag tcagcaacca tagtcccgcc cctaactccg cccatcccgc 2580
ccctaactcc gcccagttcc gcccattctc cgccccatgg ctgactaatt ttttttattt 2640
atgcagaggc cgaggccgcc tcggcctctg agctattcca gaagtagtga ggaggctttt 2700
ttggaggcct aggcttttgc aaagatcgat caagagacag gatgaggatc gtttcgcatg 2760
attgaacaag atggattgca cgcaggttct ccggccgctt gggtggagag gctattcggc 2820
tatgactggg cacaacagac aatcggctgc tctgatgccg ccgtgttccg gctgtcagcg 2880
caggggcgcc cggttctttt tgtcaagacc gacctgtccg gtgccctgaa tgaactgcaa 2940
gacgaggcag cgcggctatc gtggctggcc acgacgggcg ttccttgcgc agctgtgctc 3000
gacgttgtca ctgaagcggg aagggactgg ctgctattgg gcgaagtgcc ggggcaggat 3060
ctcctgtcat ctcaccttgc tcctgccgag aaagtatcca tcatggctga tgcaatgcgg 3120
cggctgcata cgcttgatcc ggctacctgc ccattcgacc accaagcgaa acatcgcatc 3180
gagcgagcac gtactcggat ggaagccggt cttgtcgatc aggatgatct ggacgaagag 3240
catcaggggc tcgcgccagc cgaactgttc gccaggctca aggcgagcat gcccgacggc 3300
gaggatctcg tcgtgaccca tggcgatgcc tgcttgccga atatcatggt ggaaaatggc 3360
cgcttttctg gattcatcga ctgtggccgg ctgggtgtgg cggaccgcta tcaggacata 3420
gcgttggcta cccgtgatat tgctgaagag cttggcggcg aatgggctga ccgcttcctc 3480
gtgctttacg gtatcgccgc tcccgattcg cagcgcatcg ccttctatcg ccttcttgac 3540
gagttcttct gagcgggact ctggggttcg aaatgaccga ccaagcgacg cccaacctgc 3600
catcacgaga tttcgattcc accgccgcct tctatgaaag gttgggcttc ggaatcgttt 3660
tccgggacgc cggctggatg atcctccagc gcggggatct catgctggag ttcttcgccc 3720
accctagggg gaggctaact gaaacacgga aggagacaat accggaagga acccgcgcta 3780
tgacggcaat aaaaagacag aataaaacgc acggtgttgg gtcgtttgtt cataaacgcg 3840
gggttcggtc ccagggctgg cactctgtcg ataccccacc gagaccccat tggggccaat 3900
acgcccgcgt ttcttccttt tccccacccc accccccaag ttcgggtgaa ggcccagggc 3960
tcgcagccaa cgtcggggcg gcaggccctg ccatagcctc aggttactca tatatacttt 4020
agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata 4080
atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag 4140
aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa 4200
caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt 4260
ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc 4320
cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa 4380
tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa 4440
gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc 4500
ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa 4560
gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa 4620
caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg 4680
ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc 4740
tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg 4800
ctcacatgtt ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgccatgc 4860
at 4862



<210> 88 
<211> 5192 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pIRESpuro2 plasmid from Clontech 
<400> 88 

gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180






ttagggttag 
gattattgac 
tggagttccg 
cccgcccatt 
attgacgtca 
atcatatgcc 
atgcccagta 
tcgctattac 
actcacgggg
aaaatcaacg 
gtaggcgtgt 
ctgcttactg 
gagctcggat 
ctccggattc 
attcgctgtc 
cttctgcgct 
cggtgatgcc 
caagcttgag 
ctttgccttt 
tagggcggcc 
ttggaataag 
ggcaatgtga 
tcccctctcg 
gaagcttctt 
cctggcgaca 
gcacaacccc 
tcaagcgtat 
gatctggggc 
ggccccccga 
aacccacaag 
gcgacgacgt 
cgcgccacac 
tcctcacgcg 
tggcggtctg 
cgcgcatggc 
tggcgccgca
accaccaggg 
gcgccggggt 
ggctcggctt 
tgacccgcaa 
gcgcacgacc 
gcccccgagg 
ccagccatct 
cactgtcctt 
tattctgggg 
gcatgctggg 
gagtgcattc 
cgtcgacctc 
gttatccgct 
gtgcctaatg 
cgggaaacct 
tgcgtattgg 
tgcggcgagc 
ataacgcagg 
ccgcgttgct 
gctcaagtca 
gaagctccct 
ttctcccttc 
tgtaggtcgt
gcgccttatc 
tggcagcagc 
tcttgaagtg 
tgctgaagcc 
ccgctggtag 
ctcaagaaga 
gttaagggat 
aaaaatgaag 



gcgttttgcg 
tagttattaa 
cgttacataa 
gacgtcaata 
atgggtggac 
aagtacgccc 
catgacctta 
catggtgatg 
atttccaagt 
ggactttcca 
acggtgggag 
gcttatcgaa 
cgatatctgc 
gaattcggat 
tgcgagggcc 
aagattgtca 
tttgagggtg 
gtgtggcagg 
ctctccacag 
aattccgccc 
gccggtgtgc 
gggcccggaa
ccaaaggaat 
gaagacaaac 
ggtgcctctg 
agtgccacgt 
tcaacaaggg 
ctcggtgcac 
accacgggga 
gagacgacct 
cccccgggcc 
cgtcgacccg 
cgtcgggctc 
gaccacgccg 
cgagt t gage 
ccggcccaag 
caagggtctg 
gcccgccttc 
caccgtcacc 
gcccggtgcc 
ccatggctcc 
cccaccgact 
gttgtttgcc 
tcctaataaa 

ggtggggtgg 
gatgcggtgg 
tagttgtggt 
tagctagagc 
cacaattcca 
agtgagctaa 
gtcgtgccag 
gcgctcttcc 
ggtatcagct 
aaagaacatg 
ggcgtttttc 
gaggtggcga 
cgtgcgctct 
gggaagcgtg 
tcgctccaag 
eggt aac t at 
cactggtaac 
gtggcctaac 
agttaccttc 
cggtggtttt 
tcctttgatc 
tttggtcatg 
ttttaaatca 



ctgcttcgcg 
tagtaatcaa 
ettaeggtaa 
atgacgtatg 
tatttaeggt 
cctattgacg 
tgggactttc 
cggttttggc 
ctccacccca 
aaatgtcgta 
gtctatataa 
attaatacga 
ggectagcta 
ccgcggccgc 
agctgttggg 
gtttccaaaa 
gccgcgtcca 
cttgagatct 
gtgtccacfcc 
ctctccctcc 
gtttgtctat 
acctggccct 
gcaaggtctg 
aacgtctgta 
eggecaaaag 
tgtgagttgg 
gctgaaggat 
atgetttaca 
cgtggttttc 
tccatgaccg 
gtacgcaccc 
gaccgccaca 
gaeateggea 
gagagegteg 
ggttcccggc 
gagcccgcgt 
ggcagcgccg 
ctggagacct 
gccgacgtcg 
tgacgcccgc 
gaccgaagcc 
ctagagctcg 
cctcccccgt 
afcgaggaaat 
ggcaggacag 
gctctatggc 
ttgtccaaac 
ttggcgtaat 
cacaacatac 
ctcacattaa 
ctgcattaat 
gcttcctcgc 
cactcaaagg 
tgagcaaaag 
cataggctcc 
aacccgacag 
cctgttccga 
gcgctttctc 
ctgggctgtg 
cgtcttgagt 
aggattagca 
tacggctaca 
ggaaaaagag 
tttgtttgca 
ttttctaegg 
agattatcaa 
atctaaagta 



atgtacgggc 
ttacggggtc 
atggcccgcc 
ttcccatagt 
aaactgccca 
teaatgaegg 
ctacttggca 
agtacatcaa 
ttgacgtcaa 
acaactccgc 
gcagagctct 
ctcactatag 
gegcttaagg 
atagataact 
gtgagtactc 
acgaggagga 
tctggtcaga 
ggccatacac 
ccaggtccaa 
ccccccccta 
atgtgatttt 
gtcttcttga 
ttgaatgtcg 
gcgacccttt 
ccacgtgtat 
atagttgtgg 
geccagaagg 
tgtgtttagt 
ctttgaaaaa 
agtacaagee 
tcgccgccgc 
tegagegggt 
aggtgtgggt 
aageggggge 
tggccgcgca 
ggttcctggc 
tcgtgctccc 
ccgcgccccg 
agtgcccgaa 
cccacgaccc 
gacccgggcg 
ctgatcagcc 
gccttccttg 
tgeategcat 
caagggggag 
ttctgaggcg 
tcatcaatgt 
catggtcata 
gagceggaag 
ttgcgttgcg 
gaateggeca 
tcactgactc 
eggtaatacg 
gecagcaaaa 
gcccccctga 
gactataaag 
ccctgccgct 
aatgctcacg 
tgcacgaacc 
ccaacccggt 
gagegaggta 
ctagaaggac 
ttggtagctc 
agcagcagat 
ggtctgaege 
aaaggatctt 
tatatgagta 



cagatatacg 
attagttcat 
tggctgaccg 
aacgecaata 
cttggcagta 
taaafcggccc 
gtacatctac 
tgggcgtgga 
tgggagtttg 
cccattgacg 
ctggctaact 
ggagacccaa 
cctgttaacc 
gatccagtgt 
cctctcaaaa 
tttgatattc 
aaagacaat c 
ttgagtgaca 
ctgeaggteg 
acgttactgg 
ccaccatatt 
cgagcattcc 
tgaaggaagc 
geaggcageg 
aagatacacc 
aaagagtcaa 
taccccattg 
cgaggttaaa 
cacgatgata 
cacggtgcgc 
gttcgccgac 
caccgagctg 
cgcggacgac 
ggtgttcgcc 
gcaacagatg 
caccgt egge 
cggagtggag 
caacctcccc 
ggaccgcgcg 
gcagcgcccg 
gccccgccga 
tcgactgtgc 
accctggaag 
tgtctgagta 
gattgggaag 
gaaagaacc a 
atcttatcat 
gctgtttcct 
cataaagtgt 
ctcactgccc 
aegegegggg 
gctgcgctcg 
gttatccaca 
ggecaggaac 
cgagcatcac 
ataccaggcg 
taceggatae 
ctgtaggtat 
ccccgttcag 
aagacacgac 
tgtaggcggt 
agtafcttggt 
ttgatcegge 
tacgegcaga 
tcagtggaac 
cacctagatc 
aacttggtct 



cgttgacatt 
ageccatata 
cccaacgacc 
gggactttcc 
catcaagtgt 
gectggcatt 
gtattagtca 
tagcggtttg 
ttttggcacc 
caaatgggcg 
agagaaccca 
gcttggtacc 
gg t cgt aegt 
gctggaatta 
gcgggcatga 
acctggcccg 
tttttgttgt 
atgacatcca 
ageatgeate 
ccgaagccgc 
geegtctttt 
taggggtctt 
agttcctctg 
gaacccccca 
tgcaaaggcg 
atggctctcc 
tatgggatct 
aaaaegtcta 
agcttgccac 
ctcgccaccc 
taccccgcca 
caagaactct 
ggcgccgcgg 
gagateggee 
gaaggcctcc 
gtctcgcccg 
gcggccgagc 
ttctacgagc 
acctggtgca 
accgaaagga 
ccccgcaccc 
cttctagttg 
gtgccactcc 
ggtgtcattc 
acaatagcag 
gctggggctc 
gtctgtatac 
gtgtgaaatt 
aaagcctggg 
gctttccagt 
agaggeggtt 
gtcgttcggc 
gaatcagggg 
cgtaaaaagg 
aaaaatcgac 
tttccccctg 
ctgtccgcct 
etcagttegg 
cccgaccgct 
ttatcgccac 
gctacagagt 
atctgcgctc 
aaa c aaac ca 
aaaaaaggat 
gaaaactcac 
cttttaaatt 
gacagttacc 



240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 



aatgcttaat 
cctgactccc 
ctgcaatgat 
cagccggaag 
ttaattgttg 
ttgccattgc 
ccggttccca 
gctccttcgg 
ttatggcagc 
ctggtgagta 
gcccggcgtc 
ttggaaaacg 
cgatgtaacc 
ctgggtgagc 
aatgttgaat 
gtctcatgag 
gcacatttcc 



cagtgaggca 
cgtcgtgtag 
accgcgagac 
ggccgagcgc 
ccgggaagct 
tacaggcatc 
acgatcaagg 
tcctccgatc 
actgcataat 
ctcaaccaag 
aatacgggat 
ttcttcgggg 
cactcgtgca 
aaaaacagga 
actcatactc 
cggatacata 
ccgaaaagtg 



cctatctcag 
ataactacga 
ccacgctcac 
agaagtggtc 
agagtaagta 
gtggtgtcac 
cgagttacat 
gttgtcagaa 
tctcttactg 
tcattctgag 
aataccgcgc 
cgaaaactct 
cccaactgat 
aggcaaaatg 
ttcctttttc 
tttgaatgta 
ccacctgacg 



cgatctgtct 
tacgggaggg 
cggctccaga 
ctgcaacttt 
gttcgccagt 
gctcgtcgtt 
gatcccccat 
gtaagttggc 
tcatgccatc 
aatagtgtat 
cacatagcag 
caaggatctt 
cttcagcatc 
ccgcaaaaaa 
aatattattg 
tttagaaaaa 
tc 



atttcgttca 
cttaccatct 
tttatcagca 
atccgcctcc 
taatagtttg 
tggtatggct 
gttgtgcaaa 
cgcagtgtfca 
cgtaagatgc 
gcggcgaccg 
aactfctaaaa 
accgctgttg 
ttttactttc 
gggaataagg 
aagcatttat 
taaacaaata 



tccatagttg 
ggccccagtg 
afcaaaccagc 
atccagtcta 
cgcaacgttg 
tcattcagct 
aaagcggtta 
tcactcatgg 
ttttctgtga 
agttgctctt 
gtgcfccatca 
agatccagtt 
accagcgttt 
gcgacacgga 
cagggttatt 
ggggttccgc 



4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5192 



<210> 89 
<211> 11182 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAgl Plasmid 



<400> 89 

catgccaacc 

atagtgcagt 

agtcctaagt 

gttttagtcg 

agagcgccgc 

ccaaccaacg 

ccggcaccag 

acgttgtgac 

ttgccgagcg 

acaccaccac 

agcgttcccfc 

tgaagtttgg 

tcgaccagga 

ccctgtaccg 

gtgccttccg 

gccaagagga 

cgaagagatc 

ctcaaccgtg 

gccggccagc 

tgagtaaaac 

aatacgcaag 

aagacgacca 

ttagtcgatt 

ccgctaaccg 

cggcgcgact 

atcaaggcag 

accgccgacc 

gcggcctttg 

gcgctggccg 

ccaggcactg 

cgcgaggtcc 

aagagaaaat 

gcaaggctgc 

agttgccggc 

ttaccgagct 

atgagtagat 

accgacgccg 

tgggttgtct 

cggtcgcaaa 

gaagfctgaag 



acagggttcc 
cggcttctga 
tacgcgacag 
cataaagtag 
cgctggcctg 
ggccgaactg 
gcgcgaccgc 
ag tgac cagg 
catccaggag 
gccggccggc 
aatcatcgac 
cccccgccct 
aggccgcacc 
cgcacttgag 
tgaggacgca 
acaagcatga 
gaggcggaga 
cggctgcatg 
ttggccgctg 
agcttgcgtc 
gggaacgcat 
tcgcaaccca 
ccgatcccca 
ttgtcggcat 
tcgtagtgat 
ccgacttcgt 
tggtggagct 
tcgtgtcgcg 
ggtacgagct 
ccgccgccgg 
aggcgctggc 
gagcaaaagc 
aacgttggcc 
ggaggatcac 
gctatctgaa 
gaattttagc 
tggaatgccc 
gccggccctg 
ccatccggcc 
gccgcgcagg 



cctcgggatc 
cgttcagtgc 
gctgccgccc 
aatacttgcg 
ctgggctatg 
cacgcggccg 
ccggagctgg 
ctagaccgcc 
gccggcgcgg 
cgcatggtgt 
cgcacccgga 
accctcaccc 
gtgaaagagg 
cgcagcgagg 
ttgaccgagg 
aaccgcacca 
tgatcgcggc 
aaatcctggc 
aagaaaccga 
atgcggtcgc 
gaaggt tatc 
tctagcccgc 
gggcagtgcc 
cgaccgcccg 
cgacggagcg 
gctgat tccg 
ggttaagcag 
ggcgatcaaa 
gcccattctt 
cacaaccgtt 
cgctgaaatt 
acaaacacgc 
agcctggcag 
accaagctga 
tacatcgcgc 
ggctaaagga 
catgtgtgga 
caatggcact 
cggtacaaat 
ccgcccagcg 



aaagtacttt 
agccgtcttc 
tgcccttttc 
actagaaccg 
cccgcgtcag 
gctgcaccaa 
c cagg at get 
tggcccgcag 
gectgegtag 
tgaccgtgtt 
gegggegega 
eggcacagat 
cggctgcact 
aagtgacgcc 
ccgacgccct 
ggacggccag 
egggtaegtg 
cggtttgtct 
gcgccgccgt 
tgcgtatatg 
gctgtactta 
gccctgcaac 
cgcgattggg 
acgattgacc 
ccccaggcgg 
gtgcagccaa 
cgcattgagg 
ggcacgcgca 
gagtccegta 
cttgaatcag 
aaatcaaaac 
taagtgccgg 
acacgccagc 
agatgtaege 
agctaccaga 
ggcggcatgg 
ggaacgggcg 
ggaaccccca 
cggcgcggcg 
gcaacgcatc 



gatccaaccc 
tgaaaacgac 
ctggcgtttt 
gagacattac 
caccgacgac 
gctgttttcc 
tgaccaccta 
cacccgcgac 
cctggcagag 
cgccggcatt 
ggccgccaag 
cgcgcacgcc 
gcttggcgtg 
caccgaggcc 
ggegg c eg c c 
gaegaacegt 
ttcgagccgc 
gatgecaage 
ctaaaaaggt 
atgegatgag 
accagaaagg 
tcgccggggc 
cggccgtgcg 
gcgacgtgaa 
eggacttgge 
gcccttacga 
teaeggatgg 
tcggcggtga 
tcacgcagcg 
aacccgaggg 
tcatttgagt 
ccgtccgagc 
catgaagegg 
ggtacgccaa 
gtaaatgagc 
aaaatcaaga 
gttggccagg 
ageccgagga 
ctgggtgatg 
gaggcagaag 



ctccgctgct 
atgtcgcaca 
cttgtcgcgt 
gecatgaaca 
caggacttga 
gagaagatca 
cgccctggcg 
ctactggaca 
ccgtgggccg 
gecgagtteg 
gcccgaggcg 
cgcgagctga 
catcgctcga 
aggeggegeg 
gagaatgaac 
ttttcattac 
ccgcgcacgt 
tggcggcctg 
gatgtgtatt 
taaataaaca 
egggtcagge 
cgatgttctg 
ggaagatcaa 
ggccatcggc 
tgtgtccgcg 
catatgggee 
aaggctacaa 
ggttgccgag 
cgtgagctac 
cgacgctgcc 
t aa tgaggt a 
gcacgcagca 
gtcaactttc 
ggcaagacca 
aaatgaataa 
acaaccaggc 
egtaagegge 
atcggcgtga 
acctggtgga 
cacgccccgg 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 






tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460 
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520 
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580 
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640 
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700 
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760 
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820 
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880 
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940 
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000 
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060 
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120 
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180 
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240 
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300 
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360 
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420 
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480 
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540 
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600 
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660 
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720 
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780 
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840 
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900 
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960 
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020 
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080 
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140 
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200 
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260 
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320 
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380 
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440 
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500 
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560 
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620 
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680 
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740 
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800 
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860 
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920 
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980 
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040 
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100 
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160 
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220 
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280 
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340 
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400 
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460 
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520 
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580 
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640 
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700 
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760 
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820 
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880 
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940 
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000 
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060 
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120 
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180 
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240 
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300 
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360 
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420 
ggaactactc 
ggacggggcg 
ccgtgcttga 
atgcgcacgc 
gcctccaggg 
cggggggaga 
gggcccgcgt 
cgctcccgca 
aagttgaccg 
gcctcggtgg 
gagatagatt 
ttccttatat 
agtggagata 
cacgatgctc 
aacgatagee 
tgtccttttg 
taccctttgt 
cttggagtag 
agacgfcggtt 
gggaccactg 
tttgfcaggtg 
atggaatccg 
gtcttctgag 
gttggcaagc 
taatgeaget 
aatgtgagtt 
atgtfcgtgtg 
tacgaattcg 
gagtttggac 
gatgetattg 
gaactccagc 
tccgaagccc 
gtcctgctcc 
ccgcccccac 
cgtggacacg 
ggccagggtg 
gtcccggacc 
ggtccagaac 
caacttggcc 
gcaggaattc 
accaaagggc 
attgcccagc 
aatgecatea 
ccaaagatgg 
cttcaaagca 
agaatatcaa 
taatatcggg 
cagt agaaaa 
ttcaagatgc 
tggaaaaaga 
ctgacgtaag 
aagttcattt 
tctctcgagc 
egaegtctgt 
teteggaggg 
tgcgggtaaa 
catcggccgc 
ectattgeat 
tgcccgctgt 
gecagacgag 
gtgatttcat 
acaccgtcag 
gccccgaagt 
atggccgcat 
aggtcgecaa 
acttcgagcg 
gcattggtct 



acacattatt 
gtaccggcag 
agccggccgc 
tegggtegtt 
acttcagcag 
cgtacacggt 
aggegatgee 
gaeggacgag 
tgcttgtctc 
caeggeggat 
tgtagagaga 
agaggaaggfc 
tcacatcaat 
ctcgtgggtg 
tttcctttat 
atgaagtgac 
tgaaaagtct 
acgagagtgt 
ggaaegtett 
teggcagagg 
ccaccttcct 
aggaggtttc 
actgtatctt 
tgctctagcc 
ggcacgacag 
agctcactca 
gaattgtgag 
agcctfcgact 
aaaceacaac 
ctttatttgt 
afcgagafcccc 
aacctttcat 
tcggccacga 
ggctgctcgc 
acctccgacc 
ttgtccggca 
acaccggcga 
tcgaccgctc 
atggatccag 
gatcgacact 
tattgagact 
tatctgtcac 
ttgcgataaa 
acccccaccc 
agtggattga 
agatacagtc 
aaacctcctc 
ggaaggtggc 
ctctgccgac 
agaegttcca 
ggafcgacgca 
catttggaga 
tttegcagat 
cgagaagttt 
cgaagaatct 
tagctgcgcc 
gctcccgatfc 
ctcccgccgt 
tctacaaccg 
egggttegge 
atgegegatt 
tgcgtccgtc 
ccggcacctc 
aacageggtc 
catcttcfctc 
gaggcatccg 
tgaccaactc 



atggagaaac 
gctgaagtcc 
ccgcagcatg 
gggcagcccg 
gtgggtgtag 
cgactcggcc 
ggcgacctcg 
gtcgtccgtc 
gatgtagtgg 
gtcggccggg 
gactggtgat 
ettgegaagg 
ccacfctgctt 
ggggtccatc 
cgcaatgatg 
agatagctgg 
caatagccct 
cgtgctccac 
ctttttccac 
catcttgaac 
tttctactgt 
ccgatattac 
tgatattctt 
aatacgcaaa 
gtttcccgac 
ttaggcaccc 
eggataacaa 
agagggtcga 
tagaatgeag 
aaccattata 
cgcgctggag 
agaaggegge 
agtgcacgca 
egatcteggt 
acteggegta 
ccacctggtc 
agtcgtcctc 
cggcgacgtc 
atttegctea 
ctcgtctact 
tttcaacaaa 
fctcatcaaaa 
ggaaaggcta 
acgaggagca 
tgtgataaca 
tcagaagacc 
ggattccatt 
acctacaaat 
agtggtccca 
accacgtctt 
caatcccact 
ggacacgctg 
ceggggggge 
ctgatcgaaa 
cgtgctttca 
gatggtttct 
ccggaagtgc 
gcacagggtg 
gtcgeggagg 
ccattcggac 
gctgatcccc 
gcgcaggctc 
gtgeacgegg 
attgactgga 
tggaggccgt 
gagcttgeag 
tatcagagct 



tcgagtcaaa 
agetgecaga 
ccgcgggggg 
atgacagega 
agcgtggagc 
gtccagtcgt 
ccgtccacct 
cactcctgcg 
ttgacgatgg 
cgtcgttctg 
ttcagcgtgt 
atagtgggat 
tgaagacgtg 
tttgggacca 
gcatttgtag 
gcaatggaat 
ttggtcttct 
catgttatca 
gatgctcctc 
gatagecttt 
ccttttgatg 
cctttgttga 
ggagtagacg 
ccgcctctcc 
tggaaagegg 
caggctttac 
tttcacacag 
eggtatacag 
tgaaaaaaat 
agetgeaata 
gatcatccag 
ggtggaatcg 
gttgccggcc 
catggccggc 
cagctcgtcc 
ctggaccgcg 
cacgaagt cc 
gcgcgcggtg 
agttagtata 
ccaagaatat 
gggtaatatc 
ggacagtaga 
tegttcaaga 
tcgtggaaaa 

tggtggagca 

aaagggctat 
gcccagctat 
gecatcat tg 
aagatggacc 
caaagcaagt 
atccttcgca 
aaatcaccag 
aatgagatat 
agttcgacag 
gcttcgatgt 
acaaagatcg 
ttgacattgg 
teaegttgea 
ctatggatgc 
cgcaaggaat 
atgtgtatca 
tcgatgagct 
attteggetc 
gegaggegat 
ggttggcttg 
gatcgccacg 
tggttgacgg 



teteggtgae 
aacccacgtc 
catatccgag 
ccacgctctt 
ccagtcccgt 
aggcgttgcg 
eggegacgag 
gttcctgegg 
tgcagaccgc 
ggctcatggt 
cctctccaaa 
tgtgcgtcat 
gttggaacgt 
ctgtcggcag 
gtgccacctt 
ccgaggaggt 
gagactgtat 
catcaatcca 
gtgggtgggg 
cctttatcgc 
aagtgacaga 
aaagtctcaa 
agagtgtcgt 
ccgcgcgttg 
gcagtgagcg 
actttatget 
gaaacagcta 
acatgataag 
gctttatttg 
aacaagttgg 
ccggcgtccc 
aaatctcgta 
gggtcgegea 
ccggaggcgt 
aggccgcgca 
ctgatgaaca 
egggagaac c 
ageaceggaa 
aaaaagcagg 
caaagataca 
gggaaacctc 
a a agga agg t 
tgcctctgcc 
agaagacgt t 
cgacactctc 
tgagactttt 
ctgtcacttc 
cgataaagga 
cccacccacg 
ggattgatgt 
agaccttcct 
tctctctcta 
gaaaaagect 
cgtctccgac 
aggagggegt 
ttatgtttat 
ggagt, 1 1 ag c 
agacctgcct 
gategctgeg 
eggtcaatae 
ctggcaaact 
gatgetttgg 
caacaatgtc 
gtteggggat 
tatggagcag 
actccgggcg 
caatttcgat 



gggcaggacc 
atgccagttc 
cgcctcgtgc 
gaagccctgt 
ccgctggtgg 
tgccttccag 
ccagggatag 
cteggtaegg 
cggcatgtcc 
agactcgaga 
tgaaatgaac 
cccttacgtc 
cttctttttc 
aggcatcttg 
ccttttctac 
ttcccgatat 
ctttgatatt 
ettgetttga 
gtccatcttt 
aatgatggca 
tagctgggca 
tagecctttg 
gctccaccat 
gecgattcat 
caaegcaatt 
tccggctcgt 
tgaccatgat 
atacattgat 
tgaaafcttgt 
ggtgggcgaa 
ggaaaacgat 
gcacgtgtca 
gggegaaetc 
cccggaagtt 
cccacaccca 
gggtcacgtc 
cgagccggtc 
cggcactggt 
cttcaatcct 
gtctcagaag 
ctcggattcc 
ggcaccfcaca 
gacagtggtc 
ccaaccacgt 
gtctactcca 
caa caaaggg 
atcaaaagga 
aaggctatcg 
agg age at eg 
gatatctcca 
ctatataagg 
caaatctatc 
gaactcaccg 
ctgatgeage 
ggatatgtcc 
eggcactttg 
gagagectga 
gaaaccgaac 
gecgatctta 
actacatggc 
gtgatggacg 
gcegaggact 
ctgaeggaca 
tcccaatacg 
cagacgcgct 
tatatgetec 
gatgeagett 



6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 

7200 

7260 

7320 

7380 

7440 

7500 

7560 

7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10140 

10200 

10260 

10320 

10380 

10440 
gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc gggcgtacac 10500 
aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg tgtagaagta ctcgccgata 10560 
gtggaaaccg acgccccagc actcgtccga gggcaaagaa atagagtaga tgccgaccgg 10620 
atctgtcgat cgacaagctc gagtttctcc ataataatgt gtgagtagtt cccagataag 10680 
ggaattaggg ttcctatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 10740 
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa accaaaatcc 10800 
agtactaaaa tccagatccc ccgaattaat tcggcgttaa ttcagatcaa gcttggcact 10860 
ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc gttacccaac ttaatcgcct 10920 
tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca ccgatcgccc 10980 
ttcccaacag ttgcgcagcc tgaatggcga atgctagagc agcttgagct tggatcagat 11040 
tgtcgtttcc cgccttcagt ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 11100 
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 11160 
tccgttcgtc catttgtatg tg 11182 

<210> 90 
<211> 8428 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCambia3300 Plasmid 
<400> 90 

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60 
atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120 
agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180 
gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240 
agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300 
ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360 
ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420 
acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480 
ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540 
acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600 
agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660 
tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720 
tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780 
ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840 
gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900 
gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960 
cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020 
ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080 
gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140 
tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200 
aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260 
aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320 
ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380 
ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440 
cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500 
atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560 
accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620 
gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680 
gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740 
ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800 
cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860 
aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920 
gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980 
agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040 
ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100 
atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160 
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220 
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280 
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340 
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400 
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460 
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520 
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580 
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640 
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700 
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760 
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820 
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880 
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940 
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000 
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060 
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120 
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180 
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240 
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300 
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360 
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420 
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480 
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540 
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600 
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660 
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720 
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780 
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840 
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900 
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960 
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020 
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080 
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140 
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200 
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260 
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320 
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380 
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440 
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500 
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560 
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620 
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680 
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740 
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800 
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860 
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920 
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980 
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040 
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100 
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160 
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220 
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280 
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340 
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400 
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460 
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520 
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580 
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640 
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700 
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760 
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820 
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880 
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940 
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000 
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060 
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120 
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180 
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240 
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300 
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360 
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420 
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480 
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540 
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600 
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660 
gcctccaggg 
cggggggaga 
gggcccgcgt 
cgctcccgca 
aagttgaccg 
gcctcggfcgg 
gagatagatt 
ttccttatat 
agtggagata 
cacgatgctc 
aacgatagcc 
tgtccttttg 
taccctttgt 
cttggagtag 
agacgtggtt 
gggaccactg 
ttfcgtaggtg 
atggaatccg 
gtcttctgag 
gttggcaagc 
taatgcagct 
aatgtgagtt 
atgttgtgtg 
tacgaafctcg 
ggcactggcc 
tcgccttgca 
tcgcccttcc 
tcagattgtc 
ggtaaaccta 
ggtttatccg 



acttcagcag 
cgtacacggt 
aggcgatgcc 
gacggacgag 
tgcttgtctc 
cacggcggat 
tgtagagaga 
agaggaaggt 
tcacatcaat 
ctcgtgggtg 
tttcctttat 
atgaagtgac 
tgaaaagtct 
acgagagtgt 
ggaacgtctt 
tcggcagagg 
ccaccttcct 
aggaggtttc 
actgtatctt 
tgctctagcc 
ggcacgacag 
agctcactca 
gaattgfcgag 
agcfccggtac 
gtcgttttac 
gcacatcccc 
caacagttgc 
gtttcccgcc 
agagaaaaga 
ttcgtccatt 



gtgggtgfcag 
cgactcggcc 
ggcgacctcg 
gtcgtccgtc 
gatgtagtgg 
gtcggccggg 
gactggtgat 
cttgcgaagg 
ccacttgctt 
ggggfcccatc 
cgcaatgatg 
agatagctgg 
caatagccct 
cgtgctccac 
ctttttccac 
catcttgaac 
tttctactgt 
ccgatattac 
tgatattctt 
aatacgcaaa 
gtttcccgac 
ttaggcaccc 
cggat aacaa 
ccggggatcc 
aacgtcgtga 
ctttcgccag 
gcagcctgaa 
ttcagtttaa 
gcgtttatta 
tgtatgtg 



agcgtggagc 
gtccagtcgt 
ccgtccacct 
cactcctgcg 
tfcgacgatgg 
cgtcgttctg 
ttcagcgtgt 
atagtgggat 
tgaagacgtg 
tttgggacca 
gcatttgtag 
gcaatggaat 
tfcggtcttct 
catgttatca 
gatgctcctc 
gatagccttt 
ccttttgatg 
cctttgttga 
ggagtagacg 
ccgcctctcc 
tggaaagcgg 
caggctttac 
tttcacacag 
tctagagtcg 
ctgggaaaac 
ctggcgtaat 
tggcgaatgc 
actatcagtg 
gaataacgga 



ccagtcccgt 
aggcgttgcg 

cggcgacgag 

gtfccctgcgg 
tgcagaccgc 
ggctcatggt 
cctctccaaa 
tgtgcgtcat 
gttggaacgt 
ctgtcggcag 
gtgccacctt 
ccgaggaggt 
gagactgtat 
catcaatcca 
gtgggtgggg 
cctttatcgc 
aagtgacaga 
aaagtctcaa 
agagtgtcgt 
ccgcgcgttg 
gcagtgagcg 
actttatgct 
gaaacagcta 
acctgcaggc 
cctggcgtta 
agcgaagagg 
tagagcagct 
tttgacagga 
tatttaaaag 



ccgctggtgg 
tgccttccag 
ccagggatag 
ctcggtacgg 
cggcatgtcc 
agactcgaga 
tgaaatgaac 
cccttacgtc 
cttctttttc 
aggcatcttg 
ccttttctac 
ttcccgatat 
ctttgatatt 
cttgctttga 
gtccatcttt 
aatgatggca 
tagctgggca 
tagccctttg 
gctccaccat 
gccgattcat 
caacgcaatt 
tccggctcgt 
tgaccatgat 
atgcaagctt 
cccaacttaa 
cccgcaccga 
tgagcttgga 
tatattggcg 
ggcgtgaaaa 



<210> 91 

<211> 3438 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pLIT38attBZeo Plasmid 



<400> 91 

tcgaccctct 

gtcgtgactg 

tcgccagctg 

gcctgaatgg 

gttaactacg 

tttctaaata 

ataatattga 

ttttgcggca 

tgctgaagat 

gatccttgag 

gctatgtggc 

acactattct 

tggcatgaca 

caacttactt 

gggggatcat 

cgacgagcgt 

tggcgaacta 

agttgcagga 

tggagccggt 

ctcccgtatc 

acagatcgct 

ctcatatata 

aagattgtat 

aatttttgtt 

aaatcaaaag 

cfcattaaaga 

ccactacgtg 



agtcaaggcc 
ggaaaaccct 
gcgtaatagc 
cgaatggcgc 
tcaggtggca 
cattcaaata 
aaaaggaaga 
ttttgccttc 
cagttgggtg 
agttttcgcc 
gcggtattat 
cagaatgact 
gtaagagaat 
ctgacaacga 
gtaactcgcc 
gacaccacga 
cttactctag 
ccacttctgc 
gagcgtgggt 
gtagttatct 
gagataggtg 
ctttagattg 
aagcaaatat 
aaatcagctc 
aatagcccga 
acgtggactc 
aaccatcacc 



ttaagtgagfc 
ggcgttaccc 
gaagaggccc 
ttcgcttggt 
cttttcgggg 
tgtatccgct 
gtatgagtat 
ctgtttttgc 
cacgagtggg 
c cgaagaacg 
cccgtgttga 
tggttgagta 
tatgcagtgc 
tcggaggacc 
ttgatcgttg 
tgcctgtagc 
cttcccggca 
gctcggccct 
ctcgcggtat 
acacgacggg 
cctcactgat 
atttaccccg 
ttaaattgta 
attttttaac 
gatagggttg 
caacgtcaaa 
caaatcaagt 



cgtattacgg 
aacttaatcg 
gcaccgatcg 
aataaagccc 
aaatgtgcgc 
catgagacaa 
tcaacatttc 
tcacccagaa 
ttacatcgaa 
ttctccaatg 
cgccgggcaa 
ctcaccagtc 
tgccataacc 
gaaggagcta 
ggaaccggag 
aatggcaaca 
acaattaata 
tccggctggc 
cattgcagca 
gagtcaggca 
taagcattgg 
gttgataatc 
aacgttaata 
caataggccg 
agtgttgttc 
gggcgaaaaa 
tttttggggt 



actggccgtc 
ccttgcagca 
cccttcccaa 
gcttcggcgg 
ggaaccccta 
taaccctgat 
cgtgtcgccc 
acgctggtga 
ctggatctca 
atgagcactt 
gagcaactcg 
acagaaaagc 
atgagtgata 
accgcttttt 
ctgaatgaag 
acgttgcgca 
gactggatgg 
tggtttattg 
ctggggccag 
actatggatg 
taactgtcag 
agaaaagccc 
ttttgttaaa 
aaatcggcaa 
cagtttggaa 
ccgtctatca 
cgaggtgccg 



gttttacaac 
catccccctt 
cagttgcgca 
gctttttttt 
tttgtttatt 
aaatgcttca 
ttattccctt 
aagtaaaaga 
acagcggtaa 
ttaaagttct 
gtcgccgcat 
atcttacgga 
acactgcggc 
tgcacaacat 
ccataccaaa 
aactattaac 
aggcggataa 
ctgataaatc 
atggtaagcc 
aacgaaatag 
accaagttta 
caaaaacagg 
attcgcgtta 
aatcccttat 
caagagtcca 
gggcgatggc 
taaagcacta 



6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8428 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 



aatcggaacc 
gaaaggaagg 
cgctgcgcgt 
atctaggtga 
ttccactgag 
ctgcgcgtaa 
ccggatcaag 
ccaaatactg 
ccgcctacat 
tcgtgtctta 
tgaacggggg 
tacctacagc 
tatccggtaa 
gcctggtatc 
tgatgctcgt 
ttcctggcct 
accccaggct 
acaatttcac 
ctagfcggggc 
tgctttttta 
ccggtgctca 
ttctcccggg 
ttcatcagcg 
cgcggcctgg 
gcctccgggc 
cgcgacccgg 
cgagatttcg 
gacgccggct 
aacttgttta 
aataaagcat 
tatcatgtct 



ctaaagggag 
gaagaaagcg 
aaccaccaca 
agatcctttt 
cgtcagaccc 
tctgctgctt 
agctaccaac 
ttcttctagt 
acctcgctct 
ccgggttgga 
gttcgtgcac 
gtgagctatg 
gcggcagggt 
tttatagtcc 
caggggggcg 
tttgctggcc 
ttacacttta 
acaggaaaca 
ccgtgcaatt 
tactaacttg 
ccgcgcgcga 
acttcgtgga 
cggtccagga 
acgagctgta 
cggccatgac 
ccggcaactg 
attccaccgc 
ggatgatcct 
ttgcagctta 
ttttttcact 
gtataccg 



cccccgattt 
aaaggagcgg 
cccgccgcgc 
tgataatctc 
cgtagaaaag 
gcaaacaaaa 
tctttttccg 
gtagccgtag 
gctaatcctg 
ctcaagacga 
acagcccagc 
agaaagcgcc 
cggaacagga 
tgtcgggttt 
gagcctatgg 
ttttgctcac 
tgcttccggc 
gctatgacca 
gaagccggct 
agcgaaatct 
cgtcgccgga 
ggacgacttc 
ccaggtggtg 
cgccgagtgg 
cgagatcggc 
cgtgcacttc 
cgccttctat 
ccagcgcggg 
taatggttac 
gcattctagt 
agagcttgac 
gcgctagggc 
ttaatgcgcc 
atgaccaaaa 
atcaaaggat 
aaaccaccgc 
aaggtaactg 
ttaggccacc 
ttaccagtgg 
tagttaccgg 
ttggagcgaa 
acgcttcccg 
gagcgcacga 
cgccacctct 
aaaaacgcca 
atgtaatgtg 
tcgtatgttg 
tgattacgcc 
ggcgccaagc 
ggatccatgg 
gcggtcgagt 
gccggtgtgg 
ccggacaaca 
tcggaggtcg 
gagcagccgt 
gtggccgagg 
gaaaggttgg 
gatctcatgc 
aaataaagca 
tgtggtttgt 



ggggaaagcg 
gctggcaagt 
gctacagggc 
tcccttaacg 
cttcttgaga 
taccagcggt 
gcttcagcag 
acttcaagaa 
ctgctgccag 
ataaggcgca 
cgacctacac 
aagggagaaa 
gggagcttcc 
gacttgagcg 
gcaacgcggc 
agttagctca 
tgtggaattg 
aagctacgta 
ttctctgcag 
ccaagttgac 
tctggaccga 
tccgggacga 
ccctggcctg 
tgtccacgaa 
gggggcggga 
agcaggactg 
gcttcggaat 
tggagttctt 
atagcatcac 
ccaaactcat 



aacgtggcga 
gtagcggtca 
gcgtaaaagg 
tgagttttcg 
tccttttttt 
ggtttgtttg 
agcgcagata 
ctctgtagca 
tggcgataag 
gcggtcgggc 
cgaactgaga 
ggcggacagg 

agggggaaac 

tcgatttttg 
ctttttacgg 
ctcattaggc 
tgagcggata 
atacgactca 
gattgaagcc 
cagtgccgtt 
ccggctcggg 
cgtgaccctg 
ggtgtgggtg 
cttccgggac 
gttcgccctg 
acacgtgcta 
cgttttccgg 
cgcccacccc 
aaatttcaca 
caatgtatct 



1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3438 



<210> 92 

<211> 10549 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCambia1302 Plasmid 
<300> 

<308> Genbank #AF234398 

<309> 2000-04-24 



<400> 92 

catggtagat 

tgaattagat 

tgcaacatac 

gtggccaaca 

tcatatgaag 

gaccatcttc 

agacaccctc 

cctcggccac 

gcaaaagaac 

gcaactcgct 

agacaaccat 

ccacatggtc 

atacaaagct 

ccgatcgttc 

cgatgattat 

gcatgacgtt 

acgcgataga 

ctatgttact 

cctaagagaa 

tccgttcgtc 

ttgatccaac 

tctgaaaacg 



ctgactagta 

ggtgatgtta 

ggaaaactta 
cttgtcacta 
cggcacgact 
ttcaaggacg 
gtcaacagga 
aagttggaat 
ggcatcaaag 
gatcattatc 
tacctgtcca 
cttcttgagt 
agccaccacc 
aaacatttgg 
catataattt 
atttatgaga 
aaacaaaata 
agatcgggaa 
aagagcgttt 
catttgtatg 
ccctccgctg 
acatgtcgca 



aaggagaaga 
atgggcacaa 
cccttaaatt 
ctttctctta 
tcttcaagag 
acgggaacta 
tcgagcttaa 
acaactacaa 
ccaacttcaa 
aacaaaatac 
cacaatctgc 
ttgtaacagc 
accaccacca 
caataaagtt 
ctgttgaatt 
tgggttttta 
tagcgcgcaa 
ttaaactatc 
attagaataa 
tgcatgccaa 
ctatagtgca 
caagtcctaa 



acttttcact 
attttctgtc 
tatttgcact 
tggtgttcaa 
cgccatgcct 
caagacacgt 
gggaatcgat 
ctcccacaac 
gacccgccac 
tccaattggc 
cctttcgaaa 
tgctgggatt 
cgtgtgaatt 
tcttaagatt 
acgttaagca 
tgattagagt 
actaggataa 
agtgtttgac 
cggatattta 
ccacagggtt 
gtcggcttct 
gttacgcgac 



ggagttgtcc 
agtggagagg 
actggaaaac 
tgcttttcaa 
gagggatacg 
gctgaagtca 
ttcaaggagg 
gtatacatca 
aacatcgaag 
gatggccctg 
gatcccaacg 
acacatggca 
ggtgaccagc 
gaatcctgtt 
tgtaataatt 
cccgcaatta 
attatcgcgc 
aggatatatt 
aaagggcgtg 
cccctcggga 
gacgttcagt 
aggctgccgc 



caattcttgt 
gtgaaggtga 
tacctgttcc 
gatacccaga 
tgcaggagag 
agtttgaggg 
acggaaacat 
tggccgacaa 
acggcggcgt 
tccttttacc 
aaaagagaga 
tggatgaact 
tcgaatttcc 
gccggtcttg 
aacatgtaat 
tacatttaat 
gcggtgtcat 
ggcgggtaaa 
aaaaggttta 
tcaaagtact 
gcagccgtct 
cctgcccttt 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 
tcctggcgtt 
cggagacatt 
agcaccgacg 
aagctgtttt 
cttgaccacc 
agcacccgcg 
agcctggcag 
ttcgccggca 
gaggccgcca 
atcgcgcacg 
ctgcttggcg 
cccaccgagg 
ctggcggccg 
aggacgaacc 
tgttcgagcc 
ctgatgccaa 
gtctaaaaag 
tgatgcgatg 
taaccagaaa 
actcgccggg 
ggcggccgtg 
ccgcgacgtg 
ggcggacttg 
aagcccttac 
ggtcacggat 
catcggcggt 
tatcacgcag 
agaacccgag 
actcatttga 
ggccgtccga 
gccatgaagc 
gcggtacgcc 
gagtaaatga 
ggaaaatcaa 
cggttggcca 
caagcccgag 
cgctgggtga 
tcgaggcaga 
aatcccggca 
agcaaccaga 
tcatggacgt 
gctacgagct 

tgtgggatta 

accgggaagg 
tcaagttctg 
ttcggttaaa 
tggtgacggt 
ccgggcggcc 
aaggcaagaa 
tcggccgttt 
tgttcaagac 
ccgtgcgcaa 
ggcaggctgg 
ccggttccta 
gaaaaggtct 
accggaaccc 
tgactgatat 
aaactcttaa 
tgcaaaaagc 
ctatcgcggc 
gcggacaagc 
gcgcgtttcg 
gcttgtctgt 
ggcgggtgtc 
ttaactatgc 
cgcacagatg 
actcgctgcg 



ttcttgtcgc 
acgccatgaa 
accaggactt 
ccgagaagat 
tacgccctgg 
acctactgga 
agccgtgggc 
ttgccgagtt 
aggcccgagg 
cccgcgagct 
tgcatcgctc 
ccaggcggcg 
ccgagaatga 
gtttttcatt 
gcccgcgcac 
gctggcggcc 
gtgatgtgta 
agtaaataaa 

ggcgggtcag 

gccgatgttc 
cgggaagatc 
aaggccatcg 
gctgtgtccg 
gacatatggg 
ggaaggctac 
gaggttgccg 
cgcgtgagct 
ggcgacgctg 
gttaatgagg 
gcgcacgcag 
gggtcaactt 
aaggcaagac 
gcaaatgaat 
gaacaaccag 
ggcgtaagcg 
gaatcggcgt 
tgacctggtg 
agcacgcccc 
accgccggca 
ttttttcgtt 
ggccgttttc 
tccagacggg 
cgacctggta 
gaagggagac 
ccggcgagcc 
caccacgcac 
atccgagggt 
ggagtacatc 
cccggacgtg 
tctctaccgc 
gatctacgaa 
gctgatcggg 
cccgatccta 
atgtacggag 
ctttcctgtg 
gtacattggg 
aaaagagaaa 
aacccgcctg 
gcctaccctt 
cgctggccgc 
cgcgccgtcg 
gtgatgacgg 
aagcggatgc 
ggggcgcagc 
ggcatcagag 
cgtaaggaga 
ctcggtcgtt 



gtgttttagt 
caagagcgcc 
gaccaaccaa 
caccggcacc 
cgacgttgtg 
cattgccgag 
cgacaccacc 
cgagcgttcc 
cgtgaagttt 
gatcgaccag 
gaccctgtac 
cggtgccttc 
acgccaagag 
accgaagaga 
gtctcaaccg 
tggccggcca 
tttgagtaaa 
caaatacgca 
gcaagacgac 
tgttagtcga 
aaccgctaac 
gccggcgcga 
cgatcaaggc 
ccaccgccga 
aagcggcctt 
aggcgctggc 
acccaggcac 
cccgcgaggt 
taaagagaaa 
cagcaaggct 
tcagttgccg 
cattaccgag 
aaatgagtag 
gcaccgacgc 
gctgggttgt 
gacggtcgca 
gagaagttga 
ggtgaatcgt 
gccggtgcgc 
ccgatgctct 
cgtctgtcga 
cacgtagagg 
ctgatggcgg 
aagcccggcc 
gatggcggaa 
gttgccatgc 
gaagccttga 
gagatcgagc 
ctgacggttc 
ctggcacgcc 
cgcagtggca 
tcaaatgacc 
gtcatgcgct 
cagatgctag 
gatagcacgt 
aacccaaagc 
aaaggcgatt 
gcctgtgcat 
cggtcgctgc 
tcaaaaatgg 
ccactcgacc 
tgaaaacctc 
cgggagcaga 
catgacccag 
cagattgtac 
aaataccgca 
cggctgcggc 



cgcataaagt 
gccgctggcc 
cgggccgaac 
aggcgcgacc 
acagtgacca 
cgcatccagg 
acgccggccg 
ctaatcatcg 
ggcccccgcc 
gaaggccgca 
cgcgcacttg 
cgtgaggacg 
gaacaagcat 
tcgaggcgga 
tgcggctgca 
gcttggccgc 
acagcttgcg 
aggggaacgc 
catcgcaacc 
ttccgatccc 
cgttgtcggc 
cttcgtagtg 
agccgacttc 
cctggtggag 
tgtcgtgtcg 
cgggtacgag 
tgccgccgcc 
ccaggcgctg 
atgagcaaaa 
gcaacgttgg 
gcggaggatc 
ctgctatctg 
atgaatttta 
cgtggaatgc 
ctgccggccc 
aaccatccgg 
aggccgcgca 
ggcaagcggc 
cgtcgattag 
atgacgtggg 
agcgtgaccg 
tttccgcagg 
tttcccatct 
gcgtgttccg 
agcagaaaga 
agcgtacgaa 
ttagccgcta 
tagctgattg 
accccgatta 
gcgccgcagg 
gcgccggaga 
tgccggagta 
accgcaacct 
ggcaaattgc 
acattgggaa 
cgtacattgg 
tttccgccta 
aactgtctgg 
gctccctacg 
ctggcctacg 
gccggcgccc 
tgacacatgc 
caagcccgtc 
tcacgtagcg 
tgagagtgca 
tcaggcgctc 
gagcggtatc 



agaatacttg 
tgctgggcta 
tgcacgcggc 
gcccggagct 
ggctagaccg 
aggccggcgc 
gccgcatggt 
accgcacccg 
ctaccctcac 
ccgtgaaaga 
agcgcagcga 
cattgaccga 
gaaaccgcac 
gatgatcgcg 
tgaaatcctg 
tgaagaaacc 
tcatgcggtc 
atgaaggtta 
catctagccc 
cagggcagtg 
atcgaccgcc 
atcgacggag 
gtgctgattc 
ctggttaagc 
cgggcgatca 
ctgcccattc 
ggcacaaccg 
gccgctgaaa 
gcacaaacac 
ccagcctggc 
acaccaagct 
aatacatcgc 
gcggctaaag 
cccatgtgtg 
tgcaatggca 
cccggtacaa 
ggccgcccag 
cgctgatcga 
gaagccgccc 
cacccgcgat 
acgagctggc 
gccggccggc 
aaccgaatcc 
tccacacgtt 
cgacctggta 
gaaggccaag 
caagatcgta 
gatgtaccgc 
ctttttgatc 
caaggcagaa 
gttcaagaag 
cgatttgaag 
gatcgagggc 
cctagcaggg 
cccaaagccg 
gaaccggtca 
aaactcttta 
ccagcgcaca 
ccccgccgct 
gccaggcaat 
acatcaaggc 
agctcccgga 
agggcgcgtc 
atagcggagt 
ccatatgcgg 
ttccgcttcc 
agctcactca 



cgactagaac 
tgcccgcgtc 
cggctgcacc 
ggccaggatg 
cctggcccgc 
gggcctgcgt 
gttgaccgtg 
gagcgggcgc 
cccggcacag 
ggcggctgca 
ggaagtgacg 
ggccgacgcc 
caggacggcc 
gccgggtacg 
gccggtttgt 
gagcgccgcc 
gctgcgtata 
tcgctgtact 
gcgccctgca 
cccgcgattg 
cgacgattga 
cgccccaggc 
cggtgcagcc 
agcgcattga 
aaggcacgcg 
ttgagtcccg 
ttcttgaatc 
ttaaatcaaa 
gctaagtgcc 
agacacgcca 
gaagatgtac 
gcagctacca 
gaggcggcat 
gaggaacggg 
ctggaacccc 
atcggcgcgg 
cggcaacgca 
atccgcaaag 
aagggcgacg 
agtcgcagca 
gaggtgatcc 
atggccagtg 
atgaaccgat 
gcggacgtac 
gaaacctgca 
aacggccgcc 
aagagcgaaa 
gagatcacag 
gatcccggca 
gccagatggt 
ttctgtttca 
gaggaggcgg 
gaagcatccg 
gaaaaaggtc 
tacattggga 
cacatgtaag 
aaacttatta 
gccgaagagc 
tcgcgtcggc 
ctaccagggc 
accctgcctc 
gacggtcaca 
agcgggtgtt 
gtatactggc 
tgtgaaatac 
tcgctcactg 
aaggcggtaa 



1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
tacggttatc 
aaaaggccag 
ctgacgagca 
aaagatacca 
cgcttaccgg 
cacgctgtag 
aaccccccgt 
cggtaagaca 
ggtatgtagg 
ggacagtatt 
gctcttgatc 
agattacgcg 
acgctcagtg 
acaattcatc 
agtcaaaaaa 
agaaggcaat 
ttactttgcc 
gttcctcttc 
gagtgtcttc 
ccaattcggc 
agtgaaagag 
cttcatactc 
catcatgccg 
tcatgtcctt 
ttaaatatag 
ccgtatcttt 
ttttagccat 
taattataac 
gaaaacagct 
gattttgaaa 
taccctccgc 
agcatcggta 
cggactgatg 
tgttggctgg 
aataacacat 
tggattttag 
acaaatacaa 
ggaaccctaa 
gtcgatcgac 
gcgtcggttt 
tctgcgggcg 
tcgaccctgc 
gtcaagacca 
cctccgctcg 
gatgttggcg 
tgttatgcgg 
ccggacttcg 
cgcactgacg 
gcatatgaaa 
cccgctcgtc 
tagaacagcg 
ggagatgcaa 
gagcgcggcc 
gctatttacc 
ttcgccctcc 
ctcgacagac 
gaaagctcga 
aatgaaatga 
atcccttacg 
gtcttctttt 
agaggcatct 
ttccttttct 
gtttcccgat 
atctttgata 
cacttgcttt 

gggtccatct 

gcaatgatgg 



cacagaatca 
gaaccgtaaa 
tcacaaaaat 
ggcgtttccc 
atacctgtcc 
gtatctcagt 
tcagcccgac 
cgacttatcg 
cggtgctaca 
tggtatctgc 
cggcaaacaa 
cagaaaaaaa 
gaacgaaaac 
cagtaaaata 
tagctcgaca 
gtcataccac 
atctttcaca 
gggcttttcc 
ttcccagttt 
taagcggctg 
cctgatgcac 
ttccgagcaa 
ttcaaagtgc 
ttcccgttcc 
gttttcattt 
tacgcagcgg 
ttattatttc 
aagacgaact 
ttttcaaagt 
ccgcggtgat 
gagatcatcc 
acatgagcaa 
ggctgcctgt 
ctggtggcag 
tgcggacgtt 
tactggattt 
atacatacta 
ttcccttatc 
agatccggtc 
ccactatcgg 
atttgtgtac 
gcccaagctg 
atgcggagca 
aagtagcgcg 
acctcgtatt 
ccattgtccg 
gggcagtcct 
gtgtcgtcca 
tcacgccatg 
tggctaagat 
ggcagttcgg 
taggtcaggc 
gatgcaaagt 
cgcaggacat 
gagagctgca 
gtcgcggtga 
gagagataga 
acttccttat 
tcagtggaga 
tccacgatgc 
tgaacgatag 
actgtccttt 
attacccttt 
ttcttggagt 
gaagacgtgg 
ttgggaccac 
catttgtagg 



ggggataacg 
aaggccgcgt 
cgacgctcaa 
cctggaagct 
gcctttctcc 
tcggtgtagg 
cgctgcgcct 
ccactggcag 
gagttctfcga 
gctctgctga 
accaccgctg 
ggatctcaag 
tcacgttaag 
taatatttta 
tactgttctt 
ttgtccgccc 
aagatgttgc 
gtctttaaaa 
tcgcaatcca 
tctaagctat 
tccgcataca 
aggacgccat 
aggacctttg 
acatcatagg 
tctcccacca 
tatttttcga 
cttcctcttt 
ccaattcact 
tgttttcaaa 
cacaggcagc 
gtgtttcaaa 
agtctgccgc 
atcgagtggt 
gatatattgt 
tttaatgtac 
tggttttagg 
agggtttctt 
tgggaactac 
ggcatctact 
cgagtacttc 
gcccgacagt 
catcatcgaa 
tatacgcccg 
tctgctgctc 
gggaatcccc 
tcaggacatt 
cggcccaaag 
tcacagtttg 
tagtgtattg 
cggccgcagc 
tttcaggcag 
tctcgctaaa 
gccgataaac 
atccacgccc 
tcaggtcgga 
gttcaggctt 
tttgtagaga 
atagaggaag 
tatcacatca 
tcctcgtggg 
cctttccttt 
tgatgaagtg 
gttgaaaagt 
agacgagagt 
ttggaacgtc 
tgtcggcaga 
tgccaccttc 



caggaaagaa 
tgctggcgtt 
gtcagaggtg 
ccctcgtgcg 
cttcgggaag 
tcgttcgctc 
tatccggtaa 
cagccactgg 
agtggtggcc 
agccagttac 
gtagcggtgg 
aagatccttfc 
ggattttggt 
ttttctccca 
ccccgatatc 
tgccgcttct 
tgtctcccag 
aatcatacag 
catcggccag 
tcgtataggg 
gctcgataat 
cggcctcact 
gaacaggcag 
tggtcccttt 
gcttatatac 
tcagtttttt 
tctacagtat 
gttccttgca 
gttggcgtat 
aacgctctgt 
cccggcagct 
cttacaacgg 
gattttgtgc 
ggtgtaaaca 
tgaattaacg 
aattagaaat 
atatgctcaa 
tcacacatta 
ctatttcttt 
tacacagcca 
cccggctccg 
attgccgtca 
gagtcgtggc 
catacaagcc 
gaacatcgcc 
gttggagccg 
catcagctca 
ccagtgatac 
accgattcct 
gatcgcatcc 
gtcttgcaac 
ctccccaatg 
ataacgatct 
tcctacatcg 
gacgctgtcg 
tttcatatct 
gagactggtg 
gtcttgcgaa 
atccacttgc 

tgggggtcca 

atcgcaatga 
acagatagct 
ctcaatagcc 
gtcgtgctcc 
ttctttttcc 
ggcatcttga 
cttttctact 



catgtgagca 
tttccatagg 
gcgaaacccg 
ctctcctgtt 
cgfcggcgctt 
caagctgggc 
ctatcgtctt 
taacaggatt 
taactacggc 
cttcggaaaa 
tttttttgtt 
gatcttttct 
catgcattct 
atcaggcttg 
ctccctgatc 
cccaagatca 
gtcgccgtgg 
ctcgcgcgga 
atcgttattc 
acaafcccgat 
cttttcaggg 
catgagcaga 
ctttccttcc 
ataccggctg 
cttagcagga 
caattccggt 
ttaaagatac 
ttctaaaacc 
aacatagtat 
catcgttaca 
tagttgccgt 
ctctcccgct 
cgagctgccg 
aattgacgct 
ccgaattaat 
tttattgata 
cacatgagcg 
ttatggagaa 
gccctcggac 
tcggtccaga 
gatcggacga 
accaagctct 
gatcctgcaa 
aaccacggcc 
tcgctccagt 
aaatccgcgt 
tcgagagcct 
acatggggat 
tgcggtccga 
atagcctccg 
gtgacaccct 
tcaagcactt 
ttgtagaaac 
aagctgaaag 
aacttttcga 
cattgccccc 
atttcagegt 
ggatagtggg 
tttgaagacg 
tctttgggac 
tggcatttgt 
gggcaatgga 
ctttggtctt 
accatgttat 
acgatgctcc 
aegatagect 
gtccttttga 



aaaggecage 
ctccgccccc 
acaggactat 
ccgaccctgc 
tctcatagct 
tgtgtgcacg 
gagtccaacc 
ageagagega 
tacactagaa 
agagttggta 
tgeaagcage 
aeggggtctg 
aggtactaaa 
atccccagta 
gaccggacgc 
ataaagecac 
gaaaagacaa 
tctttaaatg 
agtaagtaat 
atgtcgatgg 
ctttgttcat 
ttgctccagc 
agecat agca 
teegtcattt 
gacattcctt 
gatattctca 
cccaagaagc 
ttaaatacca 
cgacggagcc 
ateaacatge 
tcttccgaat 
gacgccgtcc 
gteggggage 
tagacaactt 
tegggggate 
gaagtatttt 
aaaccctata 
actcgagctt 
gagtgctggg 
cggccgcgct 
ttgegtcgea 
gatagagttg 
getceggatg 
tccagaagaa 
caatgaccgc 
gcacgaggtg 
gcgcgacgga 
cagcaatcgc 
atgggccgaa 
cgaccggttg 
gtgcacggcg 
ceggaategg 
catcggcgca 
cacgagattc 
tcagaaactt 
egggatctge 
gtcctctcca 
attgtgcgtc 
tggttggaac 
cactgtcggc 
aggtgccacc 
atccgaggag 
ctgagactgt 
cacatcaatc 
tcgtgggtgg 
ttcctttatc 
tgaagtgaca 



5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 



gatagctggg 
aatagccctt 
gtgctccacc 
tggccgattc 
cgcaacgcaa 
cttccggctc 
tatgaccatg 
gcatgcaagc 
tacccaactt 
ggcccgcacc 
cttgagcttg 
tcaaatagag 
cttacgactc 
ctactccaaa 
acaaagggta 
tgtgaagata 
ggccatcgtt 
gagcatcgtg 
tatctccact 
tatataagga 



caatggaatc 
tggtcttctg 
atgttggcaa 
attaatgcag 
ttaatgtgag 
gtatgttgtg 
attacgaatt 
ttggcactgg 
aatcgccttg 
gatcgccctt 
gatcagattg 
gacctaacag 
aatgacaaga 
aatatcaaag 
atatccggaa 
gtggaaaagg 
gaagatgcct 
gaaaaagaag 
gacgtaaggg 
agttcatttc 



cgaggaggtt 
agactgtatc 
gctgctctag 
ctggcacgac 
ttagctcact 
tggaattgtg 
cgagctcggt 
ccgtcgtttt 
cagcacatcc 
cccaacagtt 
tcgtttcccg 
aactcgccgt 
agaaaatctt 
atacagtctc 
acctcctcgg 
aaggtggcfcc 
ctgccgacag 
acgttccaac 
atgacgcaca 
atttggagag 
tcccgatatt 
tttgatattc 
ccaatacgca 
aggtttcccg 
cattaggcac 
agcggataac 
acccggggat 
acaacgtcgt 
ccctttcgcc 
gcgcagcctg 
ccttcagttt 
aaagactggc 
cgtcaacatg 
agaagaccaa 
attccattgc 
ctacaaatgc 
tggtcccaaa 
cacgtcttca 
atcccactat 
aacacggggg 



accctttgtt 
ttggagtaga 
aaccgcctct 
actggaaagc 
cccaggcttt 
aatttcacac 
cctctagagt 
gactgggaaa 
agctggcgta 
aatggcgaat 
agcttcatgg 
gaacagttca 
gtggagcacg 
agggcaattg 
ccagctatct 
catcattgcg 
gatggacccc 
aagcaagtgg 
ccttcgcaag 
actcttgac 



gaaaagtctc 
cgagagtgtc 
ccccgcgcgt 
gggcagtgag 
acactttatg 
aggaaacagc 
cgaccfcgcag 
accctggcgt 
atagcgaaga 
gctagagcag 
agtcaaagat 
tacagagtct 
acacacttgt 
agacttttca 
gtcactttat 
ataaaggaaa 
cacccacgag 
attgatgtga 
acccttcctc 



9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 
10080 
10140 
10200 
10260 
10320 
10380 
10440 
10500 
10549 



<210> 93 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> CaMV35SpolyA Primer 
<400> 93 

ctgaattaac gccgaattaa ttcgggggat ctg 33 

<210> 94 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> CaMV35Spr Primer 
<400> 94 

ctagagcagc ttgccaacat ggtggagca 29 

<210> 95 
<211> 12592 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAg2 Plasmid 

<400> 95 

gtacgaagaa 

gccgctacaa 

ctgattggat 

ccgattactt 

ccgcaggcaa 

ccggagagtt 

cggagtacga 

gcaacctgat 

aaattgccct 

ttgggaaccc 

acattgggaa 

ccgcctaaaa 

tgtctggcca 

ccctacgccc 

gcctacggcc 



ggccaagaac 
gatcgtaaag 
gtaccgcgag 
tttgatcgat 
ggcagaagcc 
caagaagttc 
tttgaaggag 
cgagggcgaa 
agcaggggaa 
aaagccgtac 
ccggtcacac 
cfccfcttaaaa 
gcgcacagcc 
cgccgcttcg 
aggcaatcta 



ggccgcctgg 
agcgaaaccg 
atcacagaag 
cccggcatcg 
agatggttgt 
tgtttcaccg 
gaggcggggc 
gcatccgccg 
aaaggtcgaa 
attgggaacc 
atgtaagtga 
cttattaaaa 
gaagagctgc 
cgtcggccta 
ccagggcgcg 



tgacggtatc 
ggcggccgga 
gcaagaaccc 
gccgttttct 
tcaagacgat 
tgcgcaagct 
aggctggccc 
gttcctaatg 
aaggtctctt 
ggaacccgta 
ctgatataaa 
ctcttaaaac 
aaaaagcgcc 
tcgcggccgc 
gacaagccgc 



cgagggtgaa 
gtacatcgag 
ggacgtgctg 
ctaccgcctg 
ctacgaacgc 
gatcgggtca 
gatcctagtc 
tacggagcag 
tcctgtggat 
cattgggaac 
agagaaaaaa 
ccgcctggcc 
tacccttcgg 
tggccgctca 
gccgtcgcca 



gccttgatta 
atcgagctag 
acggttcacc 
gcacgccgcg 
agtggcagcg 
aatgacctgc 
atgcgctacc 
atgctagggc 
agcacgtaca 
ccaaagccgt 
ggcgattttt 
tgtgcataac 
tcgctgcgct 
aaaatggctg 
ctcgaccgcc 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 
ggcgcccaca 
cacatgcagc 
gcccgtcagg 
cgtagcgata 
gagtgcacca 
ggcgctcttc 
cggtatcagc 
gaaagaacat 
tggcgtttfct 
agaggtggcg 
tcgtgcgctc 
cgggaagcgt 
ttcgctccaa 
ccggtaacta 
ccactggtaa 
ggtggcctaa 
cagttacctt 
gcggtggttt 
atcctttgat 
tfctfcggtcafc 
tctcccaatc 
cgafcatcctc 
cgcttctccc 
ctcccaggtc 
catacagctc 
cggccagatc 
tatagggaca 
cgataatctt 
cctcactcat 
caggcagcfcfc 
tccctttata 
tatatacctt 
gfcfcfcfcttcaa 
acagtattta 
ccttgcattc 
ggcgtataac 
gctctgtcat 
ggcagcttag 
acaacggctc 
tttgtgccga 
gfcaaacaaafc 
attaacgccg 
tagaaatttt 
tgctcaacac 
cacattatta 
taccggcagg 
gccggccgcc 
cgggtcgttg 
cttcagcagg 
gtacacggtc 
ggcgatgccg 
acggacgagg 
gcttgtctcg 
acggcggatg 
gtagagagag 
gaggaaggtc 
cacatcaatc 
tcgfcgggtgg 
ttcctttatc 
tgaagtgaca 
gaaaagtctc 
cgagagtgtc 
gaacgtcttc 
cggcagaggc 
caccttcctt 
ggaggtttcc 
ctgtatcttt 



tcaaggcacc 
tcccggagac 
gcgcgtcagc 
gcggagtgta 
tatgcggtgt 
cgcttcctcg 
tcactcaaag 
gtgagcaaaa 
ccataggctc 
aaacccgaca 
tcctgttccg 
ggcgctttct 
gctgggctgt 
tcgtcttgag 
caggattagc 
ctacggctac 
cggaaaaaga 
ttttgtttgc 
cttttctacg 
gcattctagg 
aggcttgatc 
cctgatcgac 
aagatcaata 
gccgtgggaa 
gcgcggatct 
gttattcagt 
atccgatatg 
ttcagggctt 
gagcagattg 
tccttccagc 
ccggctgtcc 
agcaggagac 
tfcccggtgat 
aagatacccc 
taaaacctta 
atagtatcga 
cgttacaatc 
ttgccgttct 
tcccgctgac 
gctgccggtc 
tgacgcttag 
aattaattcg 
attgatagaa 
atgagcgaaa 
tggagaaact 
ctgaagtcca 
cgcagcatgc 
ggcagcccga 
tgggtgtaga 
gactcggccg 
gcgacctcgc 
tcgtccgtcc 
atgtagtggt 
tcggccgggc 
actggtgatt 
ttgcgaagga 
cacttgettt 
gggtccatct 
gcaatgatgg 
gatagctggg 
aatagecett 
gtgctccacc 
tttttccacg 
atcttgaacg 
ttctactgtc 
cgatattacc 
gatattcttg 



ctgcctcgcg 
ggtcacagct 
gggtgttggc 
tactggctta 
gaaatacege 
ctcactgact 
geggtaatae 
ggccagcaaa 
cgcccccctg 
ggactataaa 
accctgccgc 
catagctcac 
gtgeacgaac 
tccaacccgg 
agagegaggt 
actagaagga 
gttggtagct 
aagcagcaga 
ggg t c tga eg 
fcaefcaaaaca 
cccagtaagfc 
eggaegcaga 
aagecactta 
aagacaagtt 
1 1 aaatggag 
aagtaatcca 
tcgatggagt 
tgttcatctt 
ctccagccat 
catagcatca 
gtcattttta 
attccttccg 
attctcattt 
aagaagctaa 
aataccagaa 
eggagecgat 
aacatgetae 
t c cgaat age 
gccgtcccgg 

ggggagctgt 

acaacttaat 

ggggatctgg 

gtattttaca 
ccctatagga 
cgagtcaaat 
getgecagaa 
cgegggggge 
tgacagegae 
gcgtggagcc 
tecagtegta 
cgtccacctc 
actcctgcgg 
tgacgatggt 
gtcgttctgg 
tcagcgtgtc 
tagtgggatt 
gaagacgtgg 
ttgggaccac 
catttgtagg 
caatggaatc 
tggtcttctg 
atgttatcac 
atgctcctcg 
atagecttte 
cttttgatga 
ctttgttgaa 
gagtagacga 



cgtfctcggtg 
tgtctgtaag 
gggtgtcggg 
actatgegge 
acagatgegt 
cgctgcgctc 
ggttatccac 
aggecaggaa 
acgagcatca 
gataccaggc 
ttaceggata 
gctgtaggta 
cccccgttca 
fcaagacacga 
atgtaggcgg 
cagtatttgg 
ettgatcegg 
ttacgcgcag 
ctcagtggaa 
attcatccag 
caaaaaatag 
aggcaatgtc 
ctttgccatc 
cctcttcggg 
tgtcttcttc 
atteggctaa 
gaaagagect 
catactcttc 
catgccgttc 
tgtccttttc 
aatataggtt 
tatcttttac 
tagecattta 
ttataacaag 
aacagctttt 
tttgaaaccg 
cctccgcgag 
ateggtaaca 
actgatgggc 
tggctggctg 
aacacattgc 
attttagtac 
aatacaaata 
accctaattc 
ctcggtgacg 
acccacgtca 
atatccgagc 
cacgctcttg 
cagtcccgtc 
ggcgttgcgt 
ggegacgage 
ttcctgcggc 
gcagaccgcc 
gctcatggta 
ctctccaaat 
gtgegtcate 
ttggaacgtc 
tgteggcaga 
tgccaccttc 

cgaggaggtt 

agactgtatc 
atcaatccac 
tgggtggggg 
etttatcgea 
agtgacagat 
aagtctcaat 
gagtgtcgtg 



atgacggtga 
eggatgeegg 
gcgcagccat 
atcagagcag 
aaggagaaaa 
ggtcgttcgg 
agaatcaggg 
ccgtaaaaag 
caaaaatcga 
gtttccccct 
cctgtccgcc 
tctcagttcg 
gcccgaccgc 
cttatcgcca 
tgctacagag 
tatctgeget 
caaacaaacc 
aaaaaaagga 
cgaaaactca 
taaaatataa 
ctcgacatac 
ataccacttg 
tttcacaaag 
cttttccgtc 
ccagttttcg 
gcggctgtct 
gafcgcactcc 
cgagcaaagg 
aaagtgcagg 
ccgttccaca 
ttcattttct 
geageggtat 
ttatttcctt 
acgaactcca 
tcaaagttgt 
eggtgatcac 
ateatcegtg 
tgagcaaagt 
tgcctgtatc 
gtggcaggat 
ggacgttttt 
tggattttgg 
catactaagg 
ccttatctgg 
ggcaggaccg 
tgccagttcc 
gcctcgtgca 
aagccctgtg 
cgctggtggc 
gccttccagg 
cagggatagc 
teggtaegga 
ggcatgtccg 
gactcgagag 
gaaatgaact 
ccttacgtca 
ttctttttcc 
ggcatcttga 
cttttctact 
tcccgatatt 
tttgatattc 
ttgctttgaa 
tccatctttg 
atgatggcat 
agctgggcaa 
agecctttgg 
ctccaccatg 



aaacctctga 
gagcagacaa 
gacccagtca 
attgtactga 
taccgcatca 
ctgeggegag 
gataaegcag 
gccgcgttgc 
cgctcaagtc 
ggaagctccc 
tttctccctt 
gtgtaggtcg 
tgegecttat 
ctggcagcag 
ttcttgaagt 
ctgetgaage 
accgctggta 
tctcaagaag 
cgttaaggga 
tattttattt 
tgttcttccc 
tccgccctgc 
atgttgctgt 
tttaaaaaat 
caatccacat 
aagctattcg 
gcatacagct 
acgccatcgg 
acctttggaa 
fccataggtgg 
cccaccagct 
ttttcgatca 
cctcttttct 
attcactgtt 
tttcaaagtt 
aggcagcaac 
tttcaaaccc 
ctgccgcctt 
gagtggtgat 
atattgtggt 
aatgtactga 
ttttaggaat 
gtttcttata 
gaactactca 
gaeggggegg 
cgtgcttgaa 
tgcgcacgct 
cctccaggga 

grgggggagac 

ggcccgcgta 
gctcccgcag 
agttgaccgt 
cctcggtggc 
agatagattt 
tccttatata 
gtggagatat 
acgatgctcc 
aegatagect 
gtccttttga 
accctttgtt 
ttggagtaga 
gacgtggttg 
ggaccactgt 
ttgtaggtgc 
tggaatccga 
tcttctgaga 
ttggcaagct 



960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 
gctctagcca 
gcacgacagg 
gctcactcat 
aattgtgagc 
gccttgacta 
aaccacaact 
tttatttgta 
tgagatcccc 
acctttcata 
cggccacgaa 
gctgctcgcc 
cctccgacca 
tgtccggcac 
caccggcgaa 
cgaccgctcc 
tggatccaga 
atcgacactc 
attgagactt 
atctgtcact 
tgcgataaag 
cccccaccca 
gtggattgat 
gatacagtct 
aacctcctcg 
gaaggtggca 
tctgccgaca 
gacgttccaa 
gatgacgcac 
atttggagag 
ttcgcagatc 
gagaagtttc 
gaagaatctc 
agctgcgccg 
ctcccgattc 
tcccgccgtg 
ctacaaccgg 
gggttcggcc 
tgcgcgattg 
gcgtccgtcg 
cggcacctcg 
acagcggtca 
atcttcttct 
aggcatccgg 
gaccaactct 
cgatgcgacg 
agaagcgcgg 
cgccccagca 
gacaagctcg 
tcctataggg 
tgtaaaatac 
ccagatcccc 
tacaacgtcg 
cccctttcgc 
tgcgcagcct 
gccttcagtt 
agaattaagg 
tggaactgac 
tgagctaagc 
atcagctagc 
gtatccaatt 
atcgaattcc 
gagttgtccc 
gtggagaggg 
ctggaaaact 
gcttttcaag 
agggatacgt 
ctgaagtcaa 



atacgcaaac 
tttcccgact 
taggcacccc 
ggataacaat 
gagggtcgac 
agaatgcagt 
accattataa 
gcgctggagg 
gaaggcggcg 
gtgcacgcag 
gatctcggtc 
ctcggcgtac 
cacctggtcc 
gtcgtcctcc 
ggcgacgtcg 
tttcgctcaa 
tcgtctactc 
ttcaacaaag 
tcatcaaaag 
gaaaggctat 
cgaggagcat 
gtgataacat 
cagaagacca 
gattccattg 
cctacaaatg 
gtggtcccaa 
ccacgtcttc 
aatcccacta 
gacacgctga 
cgggggggca 
tgatcgaaaa 
gtgctttcag 
atggtttcta 
cggaagtgct 
cacagggtgt 
tcgcggaggc 
cattcggacc 
ctgatcccca 
cgcaggctct 
tgcacgcgga 
ttgactggag 
ggaggccgtg 
agcttgcagg 
atcagagctt 
caatcgtccg 
ccgtctggac 
ctcgtccgag 
agtttctcca 
tttcgctcat 
ttctatcaat 
cgaattaatt 
tgactgggaa 
cagctggcgt 
gaatggcgaa 

tgrgggatcct: 

gagtcacgtt 
agaaccgcaa 
acatacgtca 
aaatatttct 
agagtctcat 
cgcggccgcc 
aattcttgtt 
tgaaggtgat 
acctgttccg 
atacccagat 
gcaggagagg 
gtttgaggga 



cgcctctccc 
ggaaagcggg 
aggctttaca 
ttcacacagg 
ggtatacaga 
gaaaaaaatg 
gctgcaataa 
atcatccagc 
gtggaatcga 
ttgccggccg 
atggccggcc 
agctcgtcca 
tggaccgcgc 
acgaagtccc 
cgcgcggtga 
gttagtataa 
caagaatatc 
ggtaatatcg 
gacagtagaa 
cgttcaagat 
cgtggaaaaa 
ggtggagcac 
aagggctatt 
cccagctatc 
ccatcattgc 
agatggaccc 
aaagcaagtg 
tccttcgcaa 
aatcaccagt 
atgagatatg 
gttcgacagc 
cttcgatgta 
caaagatcgt 
tgacattggg 
cacgttgcaa 
tatggatgcg 
gcaaggaatc 
tgtgtatcac 
cgatgagctg 
tttcggctcc 
cgaggcgatg 
gttggcttgt 
atcgccacga 
ggttgacggc 
atccggagcc 
cgatggctgt 
ggcaaagaaa 
taataatgtg 
gtgttgagca 
aaaatttcta 
cggcgttaat 
aaccctggcg 
aatagcgaag 
tgctagagca 
ctagactgaa 
atgacccccg 
cgttgaagga 
gaaaccatta 
tgtcaaaaat 
attcactctc 
atggtagatc 
gaattagatg 
gcaacatacg 
tggccaacac 
catatgaagc 
accatcttct 
gacaccctcg 



cgcgcgttgg 
cagtgagcgc 
ctttatgctt 
aaacagctat 
catgataaga 
ctttatttgt 
acaagttggg 
cggcgtcccg 
aatctcgtag 
ggtcgcgcag 

cggaggcgtc 

ggccgcgcac 
tgatgaacag 
gggagaaccc 
gcaccggaac 
aaaagcaggc 
aaagatacag 
ggaaacctcc 
aaggaaggtg 
gcctctgccg 
gaagacgttc 
gacactctcg 
gagacttttc 
tgtcacttca 
gataaaggaa 
ccacccacga 
gattgatgtg 
gaccttcctc 
ctctctctac 
aaaaagcctg 
gtctccgacc 
ggagggcgtg 
tatgtttatc 
gagtttagcg 
gacctgcctg 
atcgctgcgg 
ggtcaataca 
tggcaaactg 
atgctttggg 
aacaatgtcc 
ttcggggatt 
atggagcagc 
ctccgggcgt 
aatttcgatg 
gggactgtcg 
gtagaagtac 
tagagtagat 
tgagtagttc 
tataagaaac 
attcctaaaa 
tcagatcaag 
ttacccaact 
aggcccgcac 
gcttgagctt 
ggcgggaaac 
ccgatgacgc 
gccactcagc 
ttgcgcgttc 
gctccactga 
aatccaaata 
tgactagtaa 
gtgatgttaa 
gaaaacttac 
ttgtcactac 
ggcacgactt 
tcaaggacga 
tcaacaggat 



ccgattcatt 
aacgcaatta 
ccggctcgta 
gaccatgatt 
tacattgatg 
gaaatttgtg 
gtgggcgaag 
gaaaacgatt 
cacgtgtcag 
ggcgaactcc 
ccggaagttc 
ccacacccag 
ggtcacgtcg 
gagccggtcg 
ggcactggtc 
ttcaatcctg 
fcctcagaaga 
tcggattcca 
gcacctacaa 
acagtggtcc 
caaccacgtc 
tctactccaa 
aacaaagggt 
tcaaaaggac 
aggctatcgt 
ggagcatcgt 
atatctccac 
tatataagga 
aaatctatct 
aactcaccgc 
tgatgcagct 
gatatgtcct 
ggcactttgc 
agagcctgac 
aaaccgaact 
ccgatcttag 
ctacatggcg 
tgatggacga 
ccgaggactg 
tgacggacaa 
cccaatacga 
agacgcgcta 
atatgctccg 
atgcagcttg 
ggcgtacaca 
tcgccgatag 
gccgaccgga 
ccagataagg 
ccttagtatg 
ccaaaatcca 
cttggcactg 
taatcgcctt 
cgatcgccct 
ggatcagatt 
gacaatctga 
gggacaagcc 
cgcgggtttc 
aaaagtcgcc 
cgttccataa 
atctgcaccg 
aggagaagaa 
tgggcacaaa 
ccttaaattt 
tttctcttat 
cttcaagagc 
cgggaactac 
cgagcttaag 



aatgcagctg 
atgtgagtta 
tgttgtgtgg 
acgaattcga 
agtttggaca 
atgctattgc 
aactccagca 
ccgaagccca 
tcctgctcct 
cgcccccacg 
gtggacacga 
gccagggtgt 
tcccggacca 
gtccagaact 
aacttggcca 
caggaattcg 
ccaaagggct 
ttgcccagct 
atgccatcat 
caaagatgga 
ttcaaagcaa 
gaatatcaaa 
aatatcggga 
agtagaaaag 
tcaagatgcc 
ggaaaaagaa 
tgacgtaagg 
agttcatttc 
ctctcgagct 
gacgtctgtc 
ctcggagggc 
gcgggtaaat 
atcggccgcg 
ctattgcatc 
gcccgctgtt 
ccagapgagc 
tgatttcata 
caccgtcagt 
ccccgaagtc 
tggccgcata 
ggtcgccaac 
cttcgagcgg 
cattggtctt 
ggcgcagggt 
aatcgcccgc 
tggaaaccga 
tctgtcgatc 
gaattagggt 
tatttgtatt 
gtactaaaat 
gccgtcgttt 
gcagcacatc 
tcccaacagt 
gtcgtttccc 
teat gag egg 
gttttacgtt 
tggagtttaa 
taaggtcact 
attcccctcg 
gatctcgaga 
cttttcactg 
ttttctgtca 
atttgeacta 
ggtgttcaat 
gccatgcctg 
aagacacgtg 
ggaatcgatt 



4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
tcaaggagga 
tatacatcat 
acatcgaaga 
atggccctgt 
atcccaacga 
cacatggcat 
gtgaccagct 
aatcctgttg 
gtaataatta 
ccgcaattat 
ttatcgcgcg 
ggatatattg 
aagggcgtga 
ccctcgggat 
acgttcagtg 
ggctgccgcc 
gaatacttgc 
gctgggctafc 
gcacgcggcc 
cccggagctg 
gctagaccgc 
ggccggcgcg 
ccgcatggtg 
ccgcacccgg 
taccctcacc 
cgtgaaagag 
gcgcagcgag 
attgaccgag 
aaaccgcacc 
atgatcgcgg 
gaaatcctgg 
gaagaaaccg 
catgcggtcg 
tgaaggttat 
atctagcccg 
agggcagtgc 
tcgaccgccc 
tcgacggagc 
tgctgattcc 
tggttaagca 
gggcgatcaa 
tgcccattct 
gcacaaccgt 
ccgctgaaat 
cacaaacacg 
cagcctggca 
caccaagctg 
atacatcgcg 
cggctaaagg 
ccatgtgtgg 
gcaatggcac 
ccggtacaaa 
gccgcccagc 
gctgatcgaa 
aagccgccca 
acccgcgata 
cgagctggcg 
ccggccggca 
accgaatcca 
ccacacgttg 
gacctggtag 



cggaaacatc 
ggccgacaag 
cggcggcgtg 
ccttttacca 
aaagagagac 
ggatgaacta 
cgaatttccc 
ccggtcttgc 
acatgtaatg 
acatttaata 
cggtgtcatc 
gcgggtaaac 
aaaggtttat 
caaagtactt 
cagccgtctt 
ctgccctttt 
gactagaacc 
gcccgcgtca 
ggctgcacca 
gccaggatgc 
ctggcccgca 
ggcctgcgta 
ttgaccgtgt 
agcgggcgcg 
ccggcacaga 
gcggctgcac 
gaagtgacgc 
gccgacgccc 
aggacggcca 
ccgggtacgt 
ccggtttgtc 
agcgccgccg 
ctgcgtatat 
cgctgtactt 
cgccctgcaa 
ccgcgattgg 
gacgattgac 
gccccaggcg 
ggtgcagcca 
gegcattgag 
aggcacgcgc 
tgagtcccgt 
tcttgaatca 
taaatcaaaa 
ctaagtgccg 
gacacgccag 
aagatgtacg 
cagctaccag 
aggeggcatg 
aggaaeggge 
tggaaccccc 
tcggcgcggc 
ggcaaegcat 
teegcaaaga 
agggegaega 
gtegcagcat 
aggtgatccg 
tggccagtgt 
tgaaccgata 
eggaegtact 
aaacctgeat 



ctcggccaca 
caaaagaacg 
caactcgctg 
gacaaccatt 
cacatggtcc 
tacaaagcta 
egategttea 
gatgattatc 
catgaegtta 
cgcgatagaa 
tatgttacta 
ctaagagaaa 
ccgttcgtcc 
tgatccaacc 
ctgaaaacga 
cctggcgttt 
ggagacatta 
gcaccgacga 
agctgttttc 
ttgaccacct 
gcacccgcga 
gectggcaga 
tcgccggcat 
aggccgccaa 
tcgcgcacgc 
tgcttggcgt 
ccaccgaggc 
tggcggccgc 
ggacgaaccg 
gttcgagccg 
tgatgccaag 
tctaaaaagg 
gatgegatga 
aaccagaaag 
ctcgccgggg 
gcggccgtgc 
cgcgacgtga 
gcggacttgg 
agcccttacg 
gtcacggatg 
ateggeggtg 
atcaegcagc 
gaacccgagg 
ctcatttgag 
gccgtccgag 
ecatgaageg 
cggtacgcca 
agtaaatgag 
gaaaatcaag 
ggttggccag 
aageccgagg 
gctgggtgat 
cgaggcagaa 
atcccggcaa 
gcaaccagat 
catggacgtg 
ctacgagctt 
gtgggattac 
ccgggaaggg 
caagttctgc 
teggttaaac 



agttggaata 
gca t caaagc 
atcattatca 
acctgtccac 
ttcttgagtt 
gccaccacca 
aacatttggc 
atataatttc 
tttatgagat 
aacaaaatat 
gategggaat 
agagegttta 
atttgtatgt 
cctccgctgc 
catgtcgcac 
tcttgtcgcg 
cgccatgaac 
ccaggacttg 
cgagaagatc 
acgccctggc 
cctactggac 
gccgtgggcc 
tgccgagttc 
ggcccgaggc 
ccgcgagctg 
gcatcgctcg 
caggcggcgc 
cgagaatgaa 
tttttcatta 
cccgcgcacg 
ctggcggcct 
tgatgtgtat 
gtaaataaac 

gegggtcagg 

ccgatgttct 
gggaagatca 
aggecategg 
ctgtgtccgc 
acatatgggc 
gaaggctaca 
aggttgeega 
gcgtgagcta 
gcgacgctgc 
ttaatgaggt 
cgcacgcagc 
ggtcaacttt 
aggcaagacc 
caaatgaata 
aacaaccagg 
gegtaagegg 
aatcggcgtg 
gacctggtgg 
gcacgccccg 
ccgccggcag 
tttttcgttc 
gccgttttcc 
ccagacgggc 
gacctggtac 
aagggagaca 
cggcgagccg 
accacgcacg 



caactacaac 
caacttcaag 
acaaaatact 
acaatctgcc 
tgtaacagct 
ccaccaccac 
aataaagttt 
tgttgaatta 
gggtttttat 
agegegcaaa 
taaactatca 
ttagaataac 
gcatgccaac 
tatagtgcag 
aagtcctaag 
tgttttagtc 
aagagcgccg 
accaaccaac 
accggcacca 
gacgttgtga 
attgecgage 
gacaccacca 
gagcgttccc 
gtgaagtttg 
atcgaccagg 
accctgtacc 
ggtgccttcc 
cgecaagagg 
ccgaagagat 
tctcaaccgt 
ggccggccag 
1 1 gag t aaaa 
aaatacgcaa 
caagacgacc 
gttagtcgat 
accgctaacc 
ccggcgcgac 
gatcaaggca 
caccgccgac 
ageggecttt 
ggcgctggcc 
cccaggcact 
ccgcgaggtc 
aaagagaaaa 
agcaaggctg 
cagttgeegg 
attaccgagc 
aatgagtaga 
caccgacgcc 
cfcgggttgtc 
aeggtegcaa 
agaagttgaa 
gtgaatcgtg 
ccggtgcgcc 
egatgetcta 
gtctgtcgaa 
aegtagaggt 
tgatggcggt 
agcccggccg 
atggcggaaa 
ttgecatgea 



tcccacaacg 
acccgccaca 
ccaattggcg 
ctttcgaaag 
gctgggatta 
gtgtgaattg 
cttaagattg 
cgttaagcat 
gattagagtc 
ctaggataaa 
gtgtttgaca 
ggatatttaa 
cacagggttc 
teggcttctg 
ttacgegaca 
gcataaagta 
ccgctggcct 
gggecgaact 
ggcgcgaccg 
cagtgaccag 
gcatccagga 
cgccggccgg 
taatcatcga 
gcccccgccc 
aaggccgcac 
gcgcacttga 
gtgaggaege 
aacaagcatg 
egaggeggag 
geggctgeat 
cttggccgct 
cagcttgegt 
ggggaacgea 
atcgcaaccc 
tccgatcccc 
gttgtcggca 
ttcgtagtga 
gccgacttcg 
ctggtggagc 
gtcgtgtcgc 
ggg t acgagc 
gccgccgccg 
caggege t gg 
tgagcaaaag 
caacgttggc 
eggaggatea 
tgctatctga 
tgaattttag 
gtggaatgcc 
tgccggccct 
accatccggc 
ggccgcgcag 
gcaagcggcc 
gtcgattagg 
tgacgtgggc 
gcgtgaccga 
ttccgcaggg 
ttcccatcta 
cgtgttccgt 
gcagaaagac 



9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10140 

10200 

10260 

10320 

10380 

10440 

10500 

10560 

10620 

10680 

10740 

10800 

10860 

10920 

10980 

11040 

11100 

11160 

11220 

11280 

11340 

11400 

11460 

11520 

11580 

11640 

11700 

11760 

11820 

11880 

11940 

12000 

12060 

12120 

12180 

12240 

12300 

12360 

12420 

12480 

12540 

12592 



<210> 96 
<211> 3357 
<212> DNA 

<213> Artificial Sequence 






<220> 

<223> pGEMEasyNOS Plasmid 



<400> 96 

tatcactagt gaattcgcgg ccgcctgcag gtcgaccata tgggagagct cccaacgcgt 60
tggatgcata gcttgagtat tctatagtgt cacctaaata gcttggcgta atcatggtca 120
tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga 180
agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg 240
cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc 300
caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac 360
tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata 420
cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 480
aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct 540
gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa 600
agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 660
cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca 720
cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 780
ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 840
gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 900
tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga 960
acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc 1020
tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag 1080
attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac 1140
gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc 1200
ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag 1260
taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt 1320
ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag 1380
ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca 1440
gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact 1500
ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca 1560
gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg 1620
tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc 1680
atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg 1740
gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca 1800
tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt 1860
atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc 1920
agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 1980
ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca 2040
tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa 2100
aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat 2160
tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa 2220
aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga tgcggtgtga 2280
aataccgcac agatgcgtaa ggagaaaata ccgcatcagg aaattgtaag cgttaatatt 2340
ttgttaaaat tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa 2400
atcggcaaaa tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca 2460
gtttggaaca agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc 2520
gtctatcagg gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg 2580
aggtgccgta aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg 2640
ggaaagccgg cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg 2700
gcgctggcaa gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg 2760
ccgctacagg gcgcgtccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg 2820
tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg atgtgctgca aggcgattaa 2880
gttgggtaac gccagggttt tcccagtcac gacgttgtaa aacgacggcc agtgaattgt 2940
aatacgactc actatagggc gaattgggcc cgacgtcgca tgctcccggc cgccatggcg 3000
gccgcgggaa ttcgattctc gagatccggt gcagattatt tggattgaga gtgaatatga 3060
gactctaatt ggataccgag gggaatttat ggaacgtcag tggagcattt ttgacaagaa 3120
atatttgcta gctgatagtg accttaggcg acttttgaac gcgcaataat ggtttctgac 3180
gtatgtgctt agctcattaa actccagaaa cccgcggctg agtggctcct tcaacgttgc 3240
ggttctgtca gttccaaacg taaaacggct tgtcccgcgt catcggcggg ggtcataacg 3300
tgactccctt aattctccgc tcatgatcag attgtcgttt cccgccttca gtctaga 3357



<210> 97 
<211> 10122 
<212> DNA 

<213> Artificial Sequence 



<220> 






<223> p1302NOS Plasmid 



<400> 97 

catggtagat ctgactagta aaggagaaga acttttcact ggagttgtcc caattcttgt 60
tgaattagat ggtgatgtta atgggcacaa attttctgtc agtggagagg gtgaaggtga 120
tgcaacatac ggaaaactta cccttaaatt tatttgcact actggaaaac tacctgttcc 180
gtggccaaca cttgtcacta ctttctctta tggtgttcaa tgcttttcaa gatacccaga 240
tcatatgaag cggcacgact tcttcaagag cgccatgcct gagggatacg tgcaggagag 300
gaccatcttc ttcaaggacg acgggaacta caagacacgt gctgaagtca agtttgaggg 360
agacaccctc gtcaacagga tcgagcttaa gggaatcgat ttcaaggagg acggaaacat 420
cctcggccac aagttggaat acaactacaa ctcccacaac gtatacatca tggccgacaa 480
gcaaaagaac ggcatcaaag ccaacttcaa gacccgccac aacatcgaag acggcggcgt 540
gcaactcgct gatcattatc aacaaaatac tccaattggc gatggccctg tccttttacc 600
agacaaccat tacctgtcca cacaatctgc cctttcgaaa gatcccaacg aaaagagaga 660
ccacatggtc cttcttgagt ttgtaacagc tgctgggatt acacatggca tggatgaact 720
atacaaagct agccaccacc accaccacca cgtgtgaatt ggtgaccagc tcgaatttcc 780
ccgatcgttc aaacatttgg caataaagtt tcttaagatt gaatcctgtt gccggtcttg 840
cgatgattat catataattt ctgttgaatt acgttaagca tgtaataatt aacatgtaat 900
gcatgacgtt atttatgaga tgggttttta tgattagagt cccgcaatta tacatttaat 960
acgcgataga aaacaaaata tagcgcgcaa actaggataa attatcgcgc gcggtgtcat 1020
ctatgttact agatcgggaa ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 1080
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 1140
tccgttcgtc catttgtatg tgcatgccaa ccacagggtt cccctcggga tcaaagtact 1200
ttgatccaac ccctccgctg ctatagtgca gtcggcttct gacgttcagt gcagccgtct 1260
tctgaaaacg acatgtcgca caagtcctaa gttacgcgac aggctgccgc cctgcccttt 1320
tcctggcgtt ttcttgtcgc gtgttttagt cgcataaagt agaatacttg cgactagaac 1380
cggagacatt acgccatgaa caagagcgcc gccgctggcc tgctgggcta tgcccgcgtc 1440
agcaccgacg accaggactt gaccaaccaa cgggccgaac tgcacgcggc cggctgcacc 1500
aagctgtttt ccgagaagat caccggcacc aggcgcgacc gcccggagct ggccaggatg 1560
cttgaccacc tacgccctgg cgacgttgtg acagtgacca ggctagaccg cctggcccgc 1620
agcacccgcg acctactgga cattgccgag cgcatccagg aggccggcgc gggcctgcgt 1680
agcctggcag agccgtgggc cgacaccacc acgccggccg gccgcatggt gttgaccgtg 1740
ttcgccggca ttgccgagtt cgagcgttcc ctaatcatcg accgcacccg gagcgggcgc 1800
gaggccgcca aggcccgagg cgtgaagttt ggcccccgcc ctaccctcac cccggcacag 1860
atcgcgcacg cccgcgagct gatcgaccag gaaggccgca ccgtgaaaga ggcggctgca 1920
ctgcttggcg tgcatcgctc gaccctgtac cgcgcacttg agcgcagcga ggaagtgacg 1980
cccaccgagg ccaggcggcg cggtgccttc cgtgaggacg cattgaccga ggccgacgcc 2040
ctggcggccg ccgagaatga acgccaagag gaacaagcat gaaaccgcac caggacggcc 2100
aggacgaacc gtttttcatt accgaagaga tcgaggcgga gatgatcgcg gccgggtacg 2160
tgttcgagcc gcccgcgcac gtctcaaccg tgcggctgca tgaaatcctg gccggtttgt 2220
ctgatgccaa gctggcggcc tggccggcca gcttggccgc tgaagaaacc gagcgccgcc 2280
gtctaaaaag gtgatgtgta tttgagtaaa acagcttgcg tcatgcggtc gctgcgtata 2340
tgatgcgatg agtaaataaa caaatacgca aggggaacgc atgaaggtta tcgctgtact 2400
taaccagaaa ggcgggtcag gcaagacgac catcgcaacc catctagccc gcgccctgca 2460
actcgccggg gccgatgttc tgttagtcga ttccgatccc cagggcagtg cccgcgattg 2520
ggcggccgtg cgggaagatc aaccgctaac cgttgtcggc atcgaccgcc cgacgattga 2580
ccgcgacgtg aaggccatcg gccggcgcga cttcgtagtg atcgacggag cgccccaggc 2640
ggcggacttg gctgtgtccg cgatcaaggc agccgacttc gtgctgattc cggtgcagcc 2700
aagcccttac gacatatggg ccaccgccga cctggtggag ctggttaagc agcgcattga 2760
ggtcacggat ggaaggctac aagcggcctt tgtcgtgtcg cgggcgatca aaggcacgcg 2820
catcggcggt gaggttgccg aggcgctggc cgggtacgag ctgcccattc ttgagtcccg 2880
tatcacgcag cgcgtgagct acccaggcac tgccgccgcc ggcacaaccg ttcttgaatc 2940
agaacccgag ggcgacgctg cccgcgaggt ccaggcgctg gccgctgaaa ttaaatcaaa 3000
actcatttga gttaatgagg taaagagaaa atgagcaaaa gcacaaacac gctaagtgcc 3060
ggccgtccga gcgcacgcag cagcaaggct gcaacgttgg ccagcctggc agacacgcca 3120
gccatgaagc gggtcaactt tcagttgccg gcggaggatc acaccaagct gaagatgtac 3180
gcggtacgcc aaggcaagac cattaccgag ctgctatctg aatacatcgc gcagctacca 3240
gagtaaatga gcaaatgaat aaatgagtag atgaatttta gcggctaaag gaggcggcat 3300
ggaaaatcaa gaacaaccag gcaccgacgc cgtggaatgc cccatgtgtg gaggaacggg 3360
cggttggcca ggcgtaagcg gctgggttgt ctgccggccc tgcaatggca ctggaacccc 3420
caagcccgag gaatcggcgt gacggtcgca aaccatccgg cccggtacaa atcggcgcgg 3480
cgctgggtga tgacctggtg gagaagttga aggccgcgca ggccgcccag cggcaacgca 3540
tcgaggcaga agcacgcccc ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag 3600
aatcccggca accgccggca gccggtgcgc cgtcgattag gaagccgccc aagggcgacg 3660
agcaaccaga ttttttcgtt ccgatgctct atgacgtggg cacccgcgat agtcgcagca 3720
tcatggacgt ggccgttttc cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc 3780
gctacgagct tccagacggg cacgtagagg tttccgcagg gccggccggc atggccagtg 3840




tgtgggatta cgacctggta ctgatggcgg tttcccatct aaccgaatcc atgaaccgat 3900
accgggaagg gaagggagac aagcccggcc gcgtgttccg tccacacgtt gcggacgtac 3960
tcaagttctg ccggcgagcc gatggcggaa agcagaaaga cgacctggta gaaacctgca 4020
ttcggttaaa caccacgcac gttgccatgc agcgtacgaa gaaggccaag aacggccgcc 4080
tggtgacggt atccgagggt gaagccttga ttagccgcta caagatcgta aagagcgaaa 4140
ccgggcggcc ggagtacatc gagatcgagc tagctgattg gatgtaccgc gagatcacag 4200
aaggcaagaa cccggacgtg ctgacggttc accccgatta ctttttgatc gatcccggca 4260
tcggccgttt tctctaccgc ctggcacgcc gcgccgcagg caaggcagaa gccagatggt 4320
tgttcaagac gatctacgaa cgcagtggca gcgccggaga gttcaagaag ttctgtttca 4380
ccgtgcgcaa gctgatcggg tcaaatgacc tgccggagta cgatttgaag gaggaggcgg 4440
ggcaggctgg cccgatccta gtcatgcgct accgcaacct gatcgagggc gaagcatccg 4500
ccggttccta atgtacggag cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc 4560
gaaaaggtct ctttcctgtg gatagcacgt acattgggaa cccaaagccg tacattggga 4620
accggaaccc gtacattggg aacccaaagc cgtacattgg gaaccggtca cacatgtaag 4680
tgactgatat aaaagagaaa aaaggcgatt tttccgccta aaactcttta aaacttatta 4740
aaactcttaa aacccgcctg gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc 4800
tgcaaaaagc gcctaccctt cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc 4860
ctatcgcggc cgctggccgc tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc 4920
gcggacaagc cgcgccgtcg ccactcgacc gccggcgccc acatcaaggc accctgcctc 4980
gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca 5040
gcttgtctgt aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt 5100 
ggcgggtgtc ggggcgcagc catgacccag tcacgtagcg atagcggagt gtatactggc 5160
ttaactatgc ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac 5220
cgcacagatg cgtaaggaga aaataccgca tcaggcgctc ttccgcttcc tcgctcactg 5280
actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa 5340
tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc 5400
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc 5460
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat 5520
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 5580
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct 5640
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 5700
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 5760
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 5820
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 5880
ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta 5940
gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc 6000
agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg 6060 
acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgcattct aggtactaaa 6120 
acaattcatc cagtaaaata taatatttta ttttctccca atcaggcttg atccccagta 6180 
agtcaaaaaa tagctcgaca tactgttctt ccccgatatc ctccctgatc gaccggacgc 6240
agaaggcaat gtcataccac ttgtccgccc tgccgcttct cccaagatca ataaagccac 6300
ttactttgcc atctttcaca aagatgttgc tgtctcccag gtcgccgtgg gaaaagacaa 6360
gttcctcttc gggcttttcc gtctttaaaa aatcatacag ctcgcgcgga tctttaaatg 6420
gagtgtcttc ttcccagttt tcgcaatcca catcggccag atcgttattc agtaagtaat 6480
ccaattcggc taagcggctg tctaagctat tcgtataggg acaatccgat atgtcgatgg 6540
agtgaaagag cctgatgcac tccgcataca gctcgataat cttttcaggg ctttgttcat 6600
cttcatactc ttccgagcaa aggacgccat cggcctcact catgagcaga ttgctccagc 6660
catcatgccg ttcaaagtgc aggacctttg gaacaggcag ctttccttcc agccatagca 6720
tcatgtcctt ttcccgttcc acatcatagg tggtcccttt ataccggctg tccgtcattt 6780
ttaaatatag gttttcattt tctcccacca gcttatatac cttagcagga gacattcctt 6840
ccgtatcttt tacgcagcgg tatttttcga tcagtttttt caattccggt gatattctca 6900
ttttagccat ttattatttc cttcctcttt tctacagtat ttaaagatac cccaagaagc 6960
taattataac aagacgaact ccaattcact gttccttgca ttctaaaacc ttaaatacca 7020
gaaaacagct ttttcaaagt tgttttcaaa gttggcgtat aacatagtat cgacggagcc 7080
gattttgaaa ccgcggtgat cacaggcagc aacgctctgt catcgttaca atcaacatgc 7140
taccctccgc gagatcatcc gtgtttcaaa cccggcagct tagttgccgt tcttccgaat 7200
agcatcggta acatgagcaa agtctgccgc cttacaacgg ctctcccgct gacgccgtcc 7260
cggactgatg ggctgcctgt atcgagtggt gattttgtgc cgagctgccg gtcggggagc 7320
tgttggctgg ctggtggcag gatatattgt ggtgtaaaca aattgacgct tagacaactt 7380
aataacacat tgcggacgtt tttaatgtac tgaattaacg ccgaattaat tcgggggatc 7440
tggattttag tactggattt tggttttagg aattagaaat tttattgata gaagtatttt 7500
acaaatacaa atacatacta agggtttctt atatgctcaa cacatgagcg aaaccctata 7560 
ggaaccctaa ttcccttatc tgggaactac tcacacatta ttatggagaa actcgagctt 7620
gtcgatcgac agatccggtc ggcatctact ctatttcttt gccctcggac gagtgctggg 7680
gcgtcggttt ccactatcgg cgagtacttc tacacagcca tcggtccaga cggccgcgct 7740
tctgcgggcg atttgtgtac gcccgacagt cccggctccg gatcggacga ttgcgtcgca 7800
tcgaccctgc gcccaagctg catcatcgaa attgccgtca accaagctct gatagagttg 7860
gtcaagacca atgcggagca tatacgcccg gagtcgtggc gatcctgcaa gctccggatg 7920
cctccgctcg aagtagcgcg tctgctgctc catacaagcc aaccacggcc tccagaagaa 7980
gatgttggcg acctcgtatt gggaatcccc gaacatcgcc tcgctccagt caatgaccgc 8040
tgttatgcgg ccattgtccg tcaggacatt gttggagccg aaatccgcgt gcacgaggtg 8100
ccggacttcg gggcagtcct cggcccaaag catcagctca tcgagagcct gcgcgacgga 8160
cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac acatggggat cagcaatcgc 8220
gcatatgaaa tcacgccatg tagtgtattg accgattcct tgcggtccga atgggccgaa 8280
cccgctcgtc tggctaagat cggccgcagc gatcgcatcc atagcctccg cgaccggttg 8340
tagaacagcg ggcagttcgg tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg 8400
ggagatgcaa taggtcaggc tctcgctaaa ctccccaatg tcaagcactt ccggaatcgg 8460
gagcgcggcc gatgcaaagt gccgataaac ataacgatct ttgtagaaac catcggcgca 8520
gctatttacc cgcaggacat atccacgccc tcctacatcg aagctgaaag cacgagattc 8580
ttcgccctcc gagagctgca tcaggtcgga gacgctgtcg aacttttcga tcagaaactt 8640
ctcgacagac gtcgcggtga gttcaggctt tttcatatct cattgccccc ccggatctgc 8700 
gaaagctcga gagagataga tttgtagaga gagactggtg atttcagcgt gtcctctcca 8760
aatgaaatga acttccttat atagaggaag gtcttgcgaa ggatagtggg attgtgcgtc 8820
atcccttacg tcagtggaga tatcacatca atccacttgc tttgaagacg tggttggaac 8880
gtcttctttt tccacgatgc tcctcgtggg tgggggtcca tctttgggac cactgtcggc 8940
agaggcatct tgaacgatag cctttccttt atcgcaatga tggcatttgt aggtgccacc 9000
ttccttttct actgtccttt tgatgaagtg acagatagct gggcaatgga atccgaggag 9060
gtttcccgat attacccttt gttgaaaagt ctcaatagcc ctttggtctt ctgagactgt 9120
atctttgata ttcttggagt agacgagagt gtcgtgctcc accatgttat cacatcaatc 9180
cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc tcgtgggtgg 9240
gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct ttcctttatc 9300
gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga tgaagtgaca 9360
gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt gaaaagtctc 9420
aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga cgagagtgtc 9480 
gtgctccacc atgttggcaa gctgctctag ccaatacgca aaccgcctct ccccgcgcgt 9540 
tggccgattc attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag 9600 
cgcaacgcaa ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg 9660
cttccggctc gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc 9720
tatgaccatg attacgaatt cgagctcggt acccggggat cctctagact gaaggcggga 9780
aacgacaatc tgatcatgag cggagaatta agggagtcac gttatgaccc ccgccgatga 9840
cgcgggacaa gccgttttac gtttggaact gacagaaccg caacgttgaa ggagccactc 9900
agccgcgggt ttctggagtt taatgagcta agcacatacg tcagaaacca ttattgcgcg 9960
ttcaaaagtc gcctaaggtc actatcagct agcaaatatt tcttgtcaaa aatgctccac 10020
tgacgttcca taaattcccc tcggtatcca attagagtct catattcact ctcaatccaa 10080
ataatctgca ccggatctcg agaatcgaat tcccgcggcc gc 10122 

<210> 98
<211> 621
<212> DNA
<213> Artificial Sequence
<220>
<223> N. tabacum rDNA intergenic spacer (IGS) sequence
<300>
<308> Genbank #Y08422
<309> 1997-10-31
<400> 98
gtgctagcca atgtttaaca agatgtcaag cacaatgaat gttggtggtt ggtggtcgtg 60
gctggcggtg gtggaaaatt gcggtggttc gagcggtagt gatcggcgat ggttggtgtt 120
trgcagcggtg tttgatatcg gaatcactta tggtggttgt cacaatggag gtgcgtcatg 180
gttattggtg gttggtcatc tatatatttt tataataata ttaagtattt tacctatttt 240
ttacatattt tttattaaat ttatgcattg tttgtatttt taaatagttt ttatcgtact 300
tgttttataa aatattttat tattttatgt gttatattat tacttgatgt attggaaatt 360
ttctccattg ttttttctat atttataata attttcttat ttttttttgt tttattatgt 420
attttttcgt tttataataa atatttatta aaaaaaatat tatttttgta aaatatatca 480
tttacaatgt ttaaaagtca tttgtgaata tattagctaa gttgtacttc tttttgtgca 540
tttggtgttg tacatgtcta ttatgattct ctggccaaaa catgtctact cctgtcactt 600
gggttttttt ttttaagaca t 621



<210> 99 
<211> 25 
<212> DNA 
<213> Artificial Sequence
<220>
<223> NTIGS-F1 Primer
<400> 99
gtgctagcca atgtttaaca agatg 25

<210> 100 

<211> 28 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> NTIGS-R1 Primer 



<400> 100 

atgtcttaaa aaaaaaaacc caagtgac 28

<210> 101 

<211> 233 

<212> DNA 

<213> Mus musculus



<300> 

<308> Genbank #V00846 
<309> 1989-07-06 

<400> 101 

gacctggaat atggcgagaa aactgaaaat cacggaaaat gagaaataca cactttagga 60 

cgtgaaatat ggcgaggaaa actgaaaaag gtggaaaatt tagaaatgtc cactgtagga 120

cgtggaatat ggcaagaaaa ctgaaaatca tggaaaatga gaaacatcca cttgacgact 180 

tgaaaaatga cgaaatcact aaaaaacgtg aaaaatgaga aatgcacact gaa 233 

<210> 102 
<211> 31 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> MSAT-F1 Primer 
<400> 102 

aataccgcgg aagcttgacc tggaatatcg c 31

<210> 103
<211> 27
<212> DNA
<213> Artificial Sequence
<220>
<223> MSAT-R1 Primer



<400> 103 

ataaccgcgg agtccttcag tgtgcat 27 

<210> 104 
<211> 277 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Nopaline Synthase Promoter Sequence
<300>
<308> Genbank #U09365
<309> 1997-10-17
<400> 104
gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60
tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120
aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180
attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240
gcgcgcggtg tcatctatgt tactagatcg ggaattc 277

<210> 105 

<211> 1812 

<212> DNA 

<213> Escherichia coli 

<220> 
<221> CDS 

<222> (1)...(1812)

<223> Beta-Glucuronidase 

<300> 

<308> Genbank #S69414 
<309> 1994-09-23 

<400> 105 

atg tta cgt cct gta gaa acc cca acc cgt gaa atc aaa aaa ctc gac 48
Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp
1 5 10 15

ggc ctg tgg gca ttc agt ctg gat cgc gaa aac tgt gga att gat cag 96
Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln
20 25 30

cgt tgg tgg gaa agc gcg tta caa gaa agc cgg gca att gct gtg cca 144
Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro
35 40 45

ggc agt ttt aac gat cag ttc gcc gat gca gat att cgt aat tat gcg 192
Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60

ggc aac gtc tgg tat cag cgc gaa gtc ttt ata ccg aaa ggt tgg gca 240
Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala
65 70 75 80

ggc cag cgt atc gtg ctg cgt ttc gat gcg gtc act cat tac ggc aaa 288
Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95

gtg tgg gtc aat aat cag gaa gtg atg gag cat cag ggc ggc tat acg 336
Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110

cca ttt gaa gcc gat gtc acg ccg tat gtt att gcc ggg aaa agt gta 384
Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val
115 120 125

cgt atc acc gtt tgt gtg aac aac gaa ctg aac tgg cag act atc ccg 432
Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro
130 135 140

ccg gga atg gtg att acc gac gaa aac ggc aag aaa aag cag tct tac 480
Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr
145 150 155 160

ttc cat gat ttc ttt aac tat gcc gga atc cat cgc agc gta atg ctc 528
Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175

tac acc acg ccg aac acc tgg gtg gac gat atc acc gtg gtg acg cat 576
Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190

gtc gcg caa gac tgt aac cac gcg tct gtt gac tgg cag gtg gtg gcc 624
Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
195 200 205

aat ggt gat gtc agc gtt gaa ctg cgt gat gcg gat caa cag gtg gtt 672
Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
210 215 220

gca act gga caa ggc act agc ggg act ttg caa gtg gtg aat ccg cac 720
Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
225 230 235 240

ctc tgg caa ccg ggt gaa ggt tat ctc tat gaa ctg tgc gtc aca gcc 768
Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255

aaa agc cag aca gag tgt gat atc tac ccg ctt cgc gtc ggc atc cgg 816
Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270

tca gtg gca gtg aag ggc gaa cag ttc ctg att aac cac aaa ccg ttc 864
Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
275 280 285

tac ttt act ggc ttt ggt cgt cat gaa gat gcg gac ttg cgt ggc aaa 912
Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
290 295 300

gga ttc gat aac gtg ctg atg gtg cac gac cac gca tta atg gac tgg 960
Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
305 310 315 320

att ggg gcc aac tcc tac cgt acc tcg cat tac cct tac gct gaa gag 1008
Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335

atg ctc gac tgg gca gat gaa cat ggc atc gtg gtg att gat gaa act 1056
Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
340 345 350

gct gct gtc ggc ttt aac ctc tct tta ggc att ggt ttc gaa gcg ggc 1104
Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
355 360 365

aac aag ccg aaa gaa ctg tac agc gaa gag gca gtc aac ggg gaa act 1152
Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
370 375 380

cag caa gcg cac tta cag gcg att aaa gag ctg ata gcg cgt gac aaa 1200
Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
385 390 395 400

aac cac cca agc gtg gtg atg tgg agt att gcc aac gaa ccg gat acc 1248
Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
405 410 415

cgt ccg caa ggt gca cgg gaa tat ttc gcg cca ctg gcg gaa gca acg 1296
Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr
420 425 430

cgt aaa ctc gac ccg acg cgt ccg atc acc tgc gtc aat gta atg ttc 1344
Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
435 440 445

tgc gac gct cac acc gat acc atc agc gat ctc ttt gat gtg ctg tgc 1392
Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys
450 455 460

ctg aac cgt tat tac gga tgg tat gtc caa agc ggc gat ttg gaa acg 1440
Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr
465 470 475 480

gca gag aag gta ctg gaa aaa gaa ctt ctg gcc tgg cag gag aaa ctg 1488
Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495

cat cag ccg att atc atc acc gaa tac ggc gtg gat acg tta gcc ggg 1536
His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly
500 505 510

ctg cac tca atg tac acc gac atg tgg agt gaa gag tat cag tgt gca 1584
Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala
515 520 525

tgg ctg gat atg tat cac cgc gtc ttt gat cgc gtc agc gcc gtc gtc 1632
Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val
530 535 540

ggt gaa cag gta tgg aat ttc gcc gat ttt gcg acc tcg caa ggc ata 1680
Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile
545 550 555 560

ttg cgc gtt ggc ggt aac aag aaa ggg atc ttc act cgc gac cgc aaa 1728
Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575

ccg aag tcg gcg gct ttt ctg ctg caa aaa cgc tgg act ggc atg aac 1776
Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn
580 585 590

ttc ggt gaa aaa ccg cag cag gga ggc aaa caa tga 1812
Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln *
595 600

<210> 106
<211> 603
<212> PRT
<213> Escherichia coli
<300>
<308> Genbank #S69414
<309> 1994-09-23
<400> 106

Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp
1 5 10 15

Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln
20 25 30

Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro
35 40 45

Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60

Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala
65 70 75 80

Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95

Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110

Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val
115 120 125

Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro
130 135 140

Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr
145 150 155 160

Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175

Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190

Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
195 200 205

Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
210 215 220

Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
225 230 235 240

Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255

Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270

Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
275 280 285

Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
290 295 300

Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
305 310 315 320

Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335

Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
340 345 350

Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
355 360 365

Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
370 375 380

Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
385 390 395 400


Asn 


His 


Pro 


Ser 


Val 


Val 


Met 


Trp 


Ser 


lie 


Ala 


Asn 


Glu 


Pro 


Asp 


Thr 










405 










410 










415 




Arg 


Pro 


Gin 


Gly 


Ala 


Arg 


Glu 


Tyr 


Phe 


Ala 


Pro 


Leu 


Ala 


Glu 


Ala 


Thr 








420 










425 










430 






Arg 


Lys 


Leu 


Asp 


Pro 


Thr 


Arg 


Pro 


lie 


Thr 


Cys 


Val 


Asn 


Val 


Met 


Phe 






435 










440 










445 








Cys 


Asp 


Ala 


His 


Thr 


Asp 


Thr 


lie 


Ser 


Asp 


Leu 


Phe 


Asp 


Val 


Leu 


Cys 




450 










455 










460 










Leu 


Asn 


Arg 


Tyr 


Tyr 


Gly 


Trp 


Tyr 


Val 


Gin 


Ser 


Gly 


Asp 


Leu 


Glu 


Thr 


465 










470 










475 










480 


Ala 


Glu 


Lys 


Val 


Leu 


Glu 


Lys 


Glu 


Leu 


Leu 


Ala 


Trp 


Gin 


Glu 


Lys 


Leu 








485 








490 










495 




His 


Gin 


Pro 


lie 


He 


lie 


Thr 


Glu 


Tyr 


Gly Val 


Asp 


Thr 


Leu 


Ala 


Gly 








500 










505 










510 






Leu 


His 


Ser 


Met 


Tyr 


Thr 


Asp 


Met 


Trp 


Ser 


Glu 


Glu 


Tyr 


Gin 


Cys 


Ala 






515 








520 










525 








Trp 


Leu 


Asp 


Met 


Tyr 


His 


Arg 


Val 


Phe 


Asp 


Arg 


Val 


Ser 


Ala 


Val 


Val 




530 








535 










540 










Gly 


Glu 


Gin 


Val 


Trp 


Asn 


Phe 


Ala 


Asp 


Phe 


Ala 


Thr 


Ser 


Gin 


Gly 


He 


54 5 








550 










555 










560 


Leu 


Arg 


Val 


Gly 


Gly 


Asn 


Lys 


Lys 


Gly 


He 


Phe 


Thr 


Arg 


Asp 


Arg 


Lys 










565 










570 










575 




Pro 


Lys 


Ser 


Ala 


Ala 


Phe 


Leu 


Leu 


Gin 


Lys 


Arg 


Trp 


Thr Gly 


Met 


Asn 






580 










585 










590 






Phe 


Gly 


Glu 


Lys 


Pro 


Gin 


Gin 


Gly 


Gly 


Lys 


Gin 
















595 








600 



















<210> 107 

<211> 277 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Nopaline Synthase Terminator Sequence
<300> 

<308> U09365 
<309> 1995-10-17 



<400> 107 

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60 
tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120 
aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180 
attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240
gcgcgcggtg tcatctatgt tactagatcg ggaattc 277

<210> 108 
<211> 3451 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> HindIII Fragment containing the beta-glucuronidase
coding sequence, the rDNA intergenic spacer, and
the Mastl sequence



<400> 108 

aagcttgacc tggaatatcg cgagtaaact gaaaatcacg gaaaatgaga aatacacact 60
ttaggacgtg aaatatggcg aggaaaactg aaaaaggtgg aaaatttaga aatgtccact 120
gtaggacgtg gaatatggca agaaaactga aaatcatgga aaatgagaaa catccacttg 180
acgacttgaa aaatgacgaa atcactaaaa aacgtgaaaa atgagaaatg cacactgaag 240
gactccgcgg gaattcgatt gtgctagcca atgtttaaca agatgtcaag cacaatgaat 300
gttggtggtt ggtggtcgtg gctggcggtg gtggaaaatt gcggtggttc gagcggtagt 360
gatcggcgat ggttggtgtt tgcagcggtg tttgatatcg gaatcactta tggtggttgt 420
cacaatggag gtgcgtcatg gttattggtg gttggtcatc tatatatttt tataataata 480
ttaagtattt tacctatttt ttacatattt tttattaaat ttatgcattg tttgtatttt 540
taaatagttt ttatcgtact tgttttataa aatattttat tattttatgt gttatattat 600
tacttgatgt attggaaatt ttctccattg ttttttctat atttataata attttcttat 660
ttttttttgt tttattatgt attttttcgt tttataataa atatttatta aaaaaaatat 720
tatttttgta aaatatatca tttacaatgt ttaaaagtca tttgtgaata tattagctaa 780
gttgtacttc tttttgtgca tttggtgttg tacatgtcta ttatgattct ctggccaaaa 840
catgtctact cctgtcactt gggttttttt ttttaagaca taatcactag tgattatatc 900
tagactgaag gcgggaaacg acaatctgat catgagcgga gaattaaggg agtcacgtta 960
tgacccccgc cgatgacgcg ggacaagccg ttttacgttt ggaactgaca gaaccgcaac 1020
gttgaaggag ccactcagcc gcgggtttct ggagtttaat gagctaagca catacgtcag 1080
aaaccattat tgcgcgttca aaagtcgcct aaggtcacta tcagctagca aatatttctt 1140
gtcaaaaatg ctccactgac gttccataaa ttcccctcgg tatccaatta gagtctcata 1200
ttcactctca atccaaataa tctgcaccgg atctcgagat cgaattcccg cggccgcgaa 1260
ttcactagtg gatccccggg tacggtcagt cccttatgtt acgtcctgta gaaaccccaa 1320
cccgtgaaat caaaaaactc gacggcctgt gggcattcag tctggatcgc gaaaactgtg 1380
gaattgagca gcgttggtgg gaaagcgcgt tacaagaaag ccgggcaatt gctgtgccag 1440
gcagttttaa cgatcagttc gccgatgcag atattcgtaa ttatgtgggc aacgtctggt 1500
atcagcgcga agtctttata ccgaaaggtt gggcaggcca gcgtatcgtg ctgcgtttcg 1560
atgcggtcac tcattacggc aaagtgtggg tcaataatca ggaagtgatg gagcatcagg 1620
gcggctatac gccatttgaa gccgatgtca cgccgtatgt tattgccggg aaaagtgtac 1680
gtatcacagt ttgtgtgaac aacgaactga actggcagac tatcccgccg ggaatggtga 1740
ttaccgacga aaacggcaag aaaaagcagt cttacttcca tgatttcttt aactacgccg 1800
ggatccatcg cagcgtaatg ctctacacca cgccgaacac ctgggtggac gatatcaccg 1860
tggtgacgca tgtcgcgcaa gactgtaacc acgcgtctgt tgactggcag gtggtggcca 1920
atggtgatgt cagcgttgaa ctgcgtgatg cggatcaaca ggtggttgca actggacaag 1980
gcaccagcgg gactttgcaa gtggtgaatc cgcacctctg gcaaccgggt gaaggttatc 2040
tctatgaact gtacgtcaca gccaaaagcc agacagagtg tgatatctac ccgctgcgcg 2100
tcggcatccg gtcagtggca gtgaagggcg aacagttcct gatcaaccac aaaccgttct 2160
actttactgg ctttggccgt catgaagatg cggatttgcg cggcaaagga ttcgataacg 2220
tgctgatggt gcacgatcac gcattaatgg actggattgg ggccaactcc taccgtacct 2280
cgcattaccc ttacgctgaa gagatgctcg actgggcaga tgaacatggc atcgtggtga 2340
ttgatgaaac tgcagctgtc ggctttaacc tctctttagg cattggtttc gaagcgggca 2400
acaagccgaa agaactgtac agcgaagagg cagtcaacgg ggaaactcag caggcgcact 2460
tacaggcgat taaagagctg atagcgcgtg acaaaaacca cccaagcgtg gtgatgtgga 2520
gtattgccaa cgaaccggat acccgtccgc aaggtgcacg ggaatatttc gcgccactgg 2580
cggaagcaac gcgtaaactc gatccgacgc gtccgatcac ctgcgtcaat gtaatgttct 2640
gcgacgctca caccgatacc atcagcgatc tctttgatgt gctgtgcctg aaccgttatt 2700
acggttggta tgtccaaagc ggcgatttgg aaacggcaga gaaggtactg gaaaaagaac 2760
ttctggcctg gcaggagaaa ctgcatcagc cgattatcat caccgaatac ggcgtggata 2820
cgttagccgg gctgcactca atgtacaccg acatgtggag tgaagagtat cagtgtgcat 2880
ggctggatat gtatcaccgc gtctttgatc gcgtcagcgc cgtcgtcggt gaacaggtat 2940
ggaatttcgc cgattttgcg acctcgcaag gcatattgcg cgttggcggt aacaagaagg 3000
ggatcttcac ccgcgaccgc aaaccgaagt cggcggcttt tctgctgcaa aaacgctgga 3060
ctggcatgaa cttcggtgaa aaaccgcagc agggaggcaa acaatgaatc aacaactctc 3120
ctggcgcacc atcgtcggct acagcctcgg gaattgcgta ccgagctcga atttccccga 3180
tcgttcaaac atttggcaat aaagtttctt aagattgaat cctgttgccg gtcttgcgat 3240
gattatcata taatttctgt tgaattacgt taagcatgta ataattaaca tgtaatgcat 3300
gacgttattt atgagatggg tttttatgat tagagtcccg caattataca tttaatacgc 3360
gatagaaaac aaaatatagc gcgcaaacta ggataaatta tcgcgcgcgg tgtcatctat 3420
gttactagat cgggaattcg atatcaagct t 3451

<210> 109 
<211> 14627 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAglla Plasmid 
<400> 109 

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60
atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120
agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180
gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240
agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300
ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360
ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420
acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480
ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540
acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600
agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660
tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720
tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780
ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840
gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900
gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960
cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020
ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080
gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140
tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200
aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260
aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320
ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380
ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440
cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500
atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560
accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620
gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680
gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740
ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800
cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860
aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920
gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980
agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040
ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100
atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040
tacgaattcg agccttgact agagggtcga cggtatacag acatgataag atacattgat 8100
gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 8160
gatgctattg ctttatttgt aaccattata agctgcaata aacaagttgg ggtgggcgaa 8220
gaactccagc atgagatccc cgcgctggag gatcatccag ccggcgtccc ggaaaacgat 8280
tccgaagccc aacctttcat agaaggcggc ggtggaatcg aaatctcgta gcacgtgtca 8340
gtcctgctcc tcggccacga agtgcacgca gttgccggcc gggtcgcgca gggcgaactc 8400
ccgcccccac ggctgctcgc cgatctcggt catggccggc ccggaggcgt cccggaagtt 8460
cgtggacacg acctccgacc actcggcgta cagctcgtcc aggccgcgca cccacaccca 8520
ggccagggtg ttgtccggca ccacctggtc ctggaccgcg ctgatgaaca gggtcacgtc 8580
gtcccggacc acaccggcga agtcgtcctc cacgaagtcc cgggagaacc cgagccggtc 8640
ggtccagaac tcgaccgctc cggcgacgtc gcgcgcggtg agcaccggaa cggcactggt 8700
caacttggcc atggatccag atttcgctca agttagtata aaaaagcagg cttcaatcct 8760
gcaggaattc gatcgacact ctcgtctact ccaagaatat caaagataca gtctcagaag 8820
accaaagggc tattgagact tttcaacaaa gggtaatatc gggaaacctc ctcggattcc 8880
attgcccagc tatctgtcac ttcatcaaaa ggacagtaga aaaggaaggt ggcacctaca 8940
aatgccatca ttgcgataaa ggaaaggcta tcgttcaaga tgcctctgcc gacagtggtc 9000
ccaaagatgg acccccaccc acgaggagca tcgtggaaaa agaagacgtt ccaaccacgt 9060
cttcaaagca agtggattga tgtgataaca tggtggagca cgacactctc gtctactcca 9120
agaatatcaa agatacagtc tcagaagacc aaagggctat tgagactttt caacaaaggg 9180
taatatcggg aaacctcctc ggattccatt gcccagctat ctgtcacttc atcaaaagga 9240
cagtagaaaa ggaaggtggc acctacaaat gccatcattg cgataaagga aaggctatcg 9300
ttcaagatgc ctctgccgac agtggtccca aagatggacc cccacccacg aggagcatcg 9360
tggaaaaaga agacgttcca accacgtctt caaagcaagt ggattgatgt gatatctcca 9420
ctgacgtaag ggatgacgca caatcccact atccttcgca agaccttcct ctatataagg 9480
aagttcattt catttggaga ggacacgctg aaatcaccag tctctctcta caaatctatc 9540
tctctcgagc tttcgcagat ccgggggggc aatgagatat gaaaaagcct gaactcaccg 9600
cgacgtctgt cgagaagttt ctgatcgaaa agttcgacag cgtctccgac ctgatgcagc 9660
tctcggaggg cgaagaatct cgtgctttca gcttcgatgt aggagggcgt ggatatgtcc 9720
tgcgggtaaa tagctgcgcc gatggtttct acaaagatcg ttatgtttat cggcactttg 9780
catcggccgc gctcccgatt ccggaagtgc ttgacattgg ggagtttagc gagagcctga 9840
cctattgcat ctcccgccgt gcacagggtg tcacgttgca agacctgcct gaaaccgaac 9900
tgcccgctgt tctacaaccg gtcgcggagg ctatggatgc gatcgctgcg gccgatctta 9960
gccagacgag cgggttcggc ccattcggac cgcaaggaat cggtcaatac actacatggc 10020
gtgatttcat atgcgcgatt gctgatcccc atgtgtatca ctggcaaact gtgatggacg 10080
acaccgtcag tgcgtccgtc gcgcaggctc tcgatgagct gatgctttgg gccgaggact 10140
gccccgaagt ccggcacctc gtgcacgcgg atttcggctc caacaatgtc ctgacggaca 10200
atggccgcat aacagcggtc attgactgga gcgaggcgat gttcggggat tcccaatacg 10260
aggtcgccaa catcttcttc tggaggccgt ggttggcttg tatggagcag cagacgcgct 10320
acttcgagcg gaggcatccg gagcttgcag gatcgccacg actccgggcg tatatgctcc 10380
gcattggtct tgaccaactc tatcagagct tggttgacgg caatttcgat gatgcagctt 10440
gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc gggcgtacac 10500
aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg tgtagaagta ctcgccgata 10560
gtggaaaccg acgccccagc actcgtccga gggcaaagaa atagagtaga tgccgaccgg 10620
atctgtcgat cgacaagctc gagtttctcc ataataatgt gtgagtagtt cccagataag 10680
ggaattaggg ttcctatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 10740
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa accaaaatcc 10800
agtactaaaa tccagatccc ccgaattaat tcggcgttaa ttcagatcaa gcttgacctg 10860
gaatatcgcg agtaaactga aaatcacgga aaatgagaaa tacacacttt aggacgtgaa 10920
atatggcgag gaaaactgaa aaaggtggaa aatttagaaa tgtccactgt aggacgtgga 10980
atatggcaag aaaactgaaa atcatggaaa atgagaaaca tccacttgac gacttgaaaa 11040
atgacgaaat cactaaaaaa cgtgaaaaat gagaaatgca cactgaagga ctccgcggga 11100
attcgattgt gctagccaat gtttaacaag atgtcaagca caatgaatgt tggtggttgg 11160
tggtcgtggc tggcggtggt ggaaaattgc ggtggttcga gcggtagtga tcggcgatgg 11220
ttggtgtttg cagcggtgtt tgatatcgga atcacttatg gtggttgtca caatggaggt 11280
gcgtcatggt tattggtggt tggtcatcta tatattttta taataatatt aagtatttta 11340
cctatttttt acatattttt tattaaattt atgcattgtt tgtattttta aatagttttt 11400
atcgtacttg ttttataaaa tattttatta ttttatgtgt tatattatta cttgatgtat 11460
tggaaatttt ctccattgtt ttttctatat ttataataat tttcttattt ttttttgttt 11520
tattatgtat tttttcgttt tataataaat atttattaaa aaaaatatta tttttgtaaa 11580
atatatcatt tacaatgttt aaaagtcatt tgtgaatata ttagctaagt tgtacttctt 11640
tttgtgcatt tggtgttgta catgtctatt atgattctct ggccaaaaca tgtctactcc 11700
tgtcacttgg gttttttttt ttaagacata atcactagtg attatatcta gactgaaggc 11760
gggaaacgac aatctgatca tgagcggaga attaagggag tcacgttatg acccccgccg 11820
atgacgcggg acaagccgtt ttacgtttgg aactgacaga accgcaacgt tgaaggagcc 11880
actcagccgc gggtttctgg agtttaatga gctaagcaca tacgtcagaa accattattg 11940
cgcgttcaaa agtcgcctaa ggtcactatc agctagcaaa tatttcttgt caaaaatgct 12000
ccactgacgt tccataaatt cccctcggta tccaattaga gtctcatatt cactctcaat 12060
ccaaataatc tgcaccggat ctcgagatcg aattcccgcg gccgcgaatt cactagtgga 12120
tccccgggta cggtcagtcc cttatgttac gtcctgtaga aaccccaacc cgtgaaatca 12180
aaaaactcga cggcctgtgg gcattcagtc tggatcgcga aaactgtgga attgagcagc 12240
gttggtggga aagcgcgtta caagaaagcc gggcaattgc tgtgccaggc agttttaacg 12300
atcagttcgc cgatgcagat attcgtaatt atgtgggcaa cgtctggtat cagcgcgaag 12360
tctttatacc gaaaggttgg gcaggccagc gtatcgtgct gcgtttcgat gcggtcactc 12420
attacggcaa agtgtgggtc aataatcagg aagtgatgga gcatcagggc ggctatacgc 12480
catttgaagc cgatgtcacg ccgtatgtta ttgccgggaa aagtgtacgt atcacagttt 12540
gtgtgaacaa cgaactgaac tggcagacta tcccgccggg aatggtgatt accgacgaaa 12600
acggcaagaa aaagcagtct tacttccatg atttctttaa ctacgccggg atccatcgca 12660
gcgtaatgct ctacaccacg ccgaacacct gggtggacga tatcaccgtg gtgacgcatg 12720
tcgcgcaaga ctgtaaccac gcgtctgttg actggcaggt ggtggccaat ggtgatgtca 12780
gcgttgaact gcgtgatgcg gatcaacagg tggttgcaac tggacaaggc accagcggga 12840
ctttgcaagt ggtgaatccg cacctctggc aaccgggtga aggttatctc tatgaactgt 12900
acgtcacagc caaaagccag acagagtgtg atatctaccc gctgcgcgtc ggcatccggt 12960
cagtggcagt gaagggcgaa cagttcctga tcaaccacaa accgttctac tttactggct 13020
ttggccgtca tgaagatgcg gatttgcgcg gcaaaggatt cgataacgtg ctgatggtgc 13080
acgatcacgc attaatggac tggattgggg ccaactccta ccgtacctcg cattaccctt 13140
acgctgaaga gatgctcgac tgggcagatg aacatggcat cgtggtgatt gatgaaactg 13200
cagctgtcgg ctttaacctc tctttaggca ttggtttcga agcgggcaac aagccgaaag 13260
aactgtacag cgaagaggca gtcaacgggg aaactcagca ggcgcactta caggcgatta 13320
aagagctgat agcgcgtgac aaaaaccacc caagcgtggt gatgtggagt attgccaacg 13380
aaccggatac ccgtccgcaa ggtgcacggg aatatttcgc gccactggcg gaagcaacgc 13440
gtaaactcga tccgacgcgt ccgatcacct gcgtcaatgt aatgttctgc gacgctcaca 13500
ccgataccat cagcgatctc tttgatgtgc tgtgcctgaa ccgttattac ggttggtatg 13560
tccaaagcgg cgatttggaa acggcagaga aggtactgga aaaagaactt ctggcctggc 13620
aggagaaact gcatcagccg attatcatca ccgaatacgg cgtggatacg ttagccgggc 13680
tgcactcaat gtacaccgac atgtggagtg aagagtatca gtgtgcatgg ctggatatgt 13740
atcaccgcgt ctttgatcgc gtcagcgccg tcgtcggtga acaggtatgg aatttcgccg 13800
attttgcgac ctcgcaaggc atattgcgcg ttggcggtaa caagaagggg atcttcaccc 13860
gcgaccgcaa accgaagtcg gcggcttttc tgctgcaaaa acgctggact ggcatgaact 13920
tcggtgaaaa accgcagcag ggaggcaaac aatgaatcaa caactctcct ggcgcaccat 13980
cgtcggctac agcctcggga attgcgtacc gagctcgaat ttccccgatc gttcaaacat 14040
ttggcaataa agtttcttaa gattgaatcc tgttgccggt cttgcgatga ttatcatata 14100
atttctgttg aattacgtta agcatgtaat aattaacatg taatgcatga cgttatttat 14160
gagatgggtt tttatgatta gagtcccgca attatacatt taatacgcga tagaaaacaa 14220
aatatagcgc gcaaactagg ataaattatc gcgcgcggtg tcatctatgt tactagatcg 14280
ggaattcgat atcaagcttg gcactggccg tcgttttaca acgtcgtgac tgggaaaacc 14340
ctggcgttac ccaacttaat cgccttgcag cacatccccc tttcgccagc tggcgtaata 14400
gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatgct 14460
agagcagctt gagcttggat cagattgtcg tttcccgcct tcagtttaaa ctatcagtgt 14520
ttgacaggat atattggcgg gtaaacctaa gagaaaagag cgtttattag aataacggat 14580
atttaaaagg gcgtgaaaag gtttatccgt tcgtccattt gtatgtg 14627

<210> 110 
<211> 9080 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> p18attBZeo(6XHS4)2eGFP Plasmid
<400> 110 

cagttgccgg ccgggtcgcg cagggcgaac tcccgccccc acggctgctc gccgatctcg 60
gtcatggccg gcccggaggc gtcccggaag ttcgtggaca cgacctccga ccactcggcg 120
tacagctcgt ccaggccgcg cacccacacc caggccaggg tgttgtccgg caccacctgg 180
tcctggaccg cgctgatgaa cagggtcacg tcgtcccgga ccacaccggc gaagtcgtcc 240
tccacgaagt cccgggagaa cccgagccgg tcggtccaga actcgaccgc tccggcgacg 300
tcgcgcgcgg tgagcaccgg aacggcactg gtcaacttgg ccatggatcc agatttcgct 360
caagttagta taaaaaagca ggcttcaatc ctgcagagaa gcttgatatc gaattcctgc 420
agccccgcgg atccgctcac ggggacagcc cccccccaaa gcccccaggg atgtaattac 480
gtccctcccc cgctaggggg cagcagcgag ccgcccgggg ctccgctccg gtccggcgct 540
ccccccgcat ccccgagccg gcagcgtgcg gggacagccc gggcacgggg aaggtggcac 600
gggatcgctt tcctctgaac gcttctcgct gctctttgag cctgcagaca cctgggggat 660
acggggccgc ggatccgctc acggggacag ccccccccca aagcccccag ggatgtaatt 720
acgtccctcc cccgctaggg ggcagcagcg agccgcccgg ggctccgctc cggtccggcg 780
ctccccccgc atccccgagc cggcagcgtg cggggacagc ccgggcacgg ggaaggtggc 840
acgggatcgc tttcctctga acgcttctcg ctgctctttg agcctgcaga cacctggggg 900
atacggggcc gcggatccgc tcacggggac agcccccccc caaagccccc agggatgtaa 960
ttacgtccct cccccgctag ggggcagcag cgagccgccc ggggctccgc tccggtccgg 1020
cgctcccccc gcatccccga gccggcagcg tgcggggaca gcccgggcac ggggaaggtg 1080
gcacgggatc gctttcctct gaacgcttct cgctgctctt tgagcctgca gacacctggg 1140
ggatacgggg ccgcggatcc gctcacgggg acagcccccc cccaaagccc ccagggatgt 1200
aattacgtcc ctcccccgct agggggcagc agcgagccgc ccggggctcc gctccggtcc 1260
ggcgctcccc ccgcatcccc gagccggcag cgtgcgggga cagcccgggc acggggaagg 1320
tggcacggga tcgctttcct ctgaacgctt ctcgctgctc tttgagcctg cagacacctg 1380
ggggatacgg ggccgcggat ccgctcacgg ggacagcccc cccccaaagc ccccagggat 1440
gtaattacgt ccctcccccg ctagggggca gcagcgagcc gcccggggct ccgctccggt 1500
ccggcgctcc ccccgcatcc ccgagccggc agcgtgcggg gacagcccgg gcacggggaa 1560
ggtggcacgg gatcgctttc ctctgaacgc ttctcgctgc tctttgagcc tgcagacacc 1620
tgggggatac ggggccgcgg atccgctcac ggggacagcc cccccccaaa gcccccaggg 1680
atgtaattac gtccctcccc cgctaggggg cagcagcgag ccgcccgggg ctccgctccg 1740
gtccggcgct ccccccgcat ccccgagccg gcagcgtgcg gggacagccc gggcacgggg 1800
aaggtggcac gggatcgctt tcctctgaac gcttctcgct gctctttgag cctgcagaca 1860
cctgggggat acggggcggg ggatccacta gttattaata gtaatcaatt acggggtcat 1920
tagttcatag cccatatatg gagttccgcg ttacataact tacggtaaat ggcccgcctg 1980
gctgaccgcc caacgacccc cgcccattga cgtcaataat gacgtatgtt cccatagtaa 2040
cgccaatagg gactttccat tgacgtcaat gggtggacta tttacggtaa actgcccact 2100
tggcagtaca tcaagtgtat catatgccaa gtacgccccc tattgacgtc aatgacggta 2160
aatggcccgc ctggcattat gcccagtaca tgaccttatg ggactttcct acttggcagt 2220
acatctacgt attagtcatc gctattacca tgggtcgagg tgagccccac gttctgcttc 2280
actctcccca tctccccccc ctccccaccc ccaattttgt atttatttat tttttaatta 2340
ttttgtgcag cgatgggggc gggggggggg ggggcgcgcg ccaggcgggg cggggcgggg 2400
cgaggggcgg ggcggggcga ggcggagagg tgcggcggca gccaatcaga gcggcgcgct 2460
ccgaaagttt ccttttatgg cgaggcggcg gcggcggcgg ccctataaaa agcgaagcgc 2520
gcggcgggcg ggagtcgctg cgttgccttc gccccgtgcc ccgctccgcg ccgcctcgcg 2580
ccgcccgccc cggctctgac tgaccgcgtt actcccacag gtgagcgggc gggacggccc 2640
ttctcctccg ggctgtaatt agcgcttggt ttaatgacgg ctcgtttctt ttctgtggct 2700
gcgtgaaagc cttaaagggc tccgggaggg ccctttgtgc gggggggagc ggctcggggg 2760
gtgcgtgcgt gtgtgtgtgc gtggggagcg ccgcgtgcgg cccgcgctgc ccggcggctg 2820
tgagcgctgc gggcgcggcg cggggctttg tgcgctccgc gtgtgcgcga ggggagcgcg 2880
gccgggggcg gtgccccgcg gtgcgggggg gctgcgaggg gaacaaaggc tgcgtgcggg 2940
gtgtgtgcgt gggggggtga gcagggggtg tgggcgcggc ggtcgggctg taaccccccc 3000
ctgcaccccc ctccccgagt tgctgagcac ggcccggctt cgggtgcggg gctccgtgcg 3060
gggcgtggcg cggggctcgc cgtgccgggc ggggggtggc ggcaggtggg ggtgccgggc 3120
ggggcggggc cgcctcgggc cggggagggc tcgggggagg ggcgcggcgg ccccggagcg 3180
ccggcggctg tcgaggcgcg gcgagccgca gccattgcct tttatggtaa tcgtgcgaga 3240
gggcgcaggg acttcctttg tcccaaatct ggcggagccg aaatctggga ggcgccgccg 3300
caccccctct
gagggccttc 
cgcaggggga 
tgaccggcgg 
tcctgggcaa 
tggtgagcaa 
gcgacgtaaa 
gcaagctgac 
tcgtgaccac 
agcacgactt 
tcaaggacga 
tgaaccgcat 
agctggagta 
gcatcaaggt 
accactacca 
acctgagcac
tgctggagtt 
aattcactcc 
tggctcacaa 
agccccttga 
gttggaattt 
atcagaatga 
caaaggtggc 
ttccatagaa 
ttttctttaa 
tcctgactac 
cccaagcttg 
gtctgcaggc 
cccgtgcccg 
ggagcggagc 
cctgggggct 
gtgtctgcag 
tccccgtgcc 
ccggagcgga 
tccctggggg 
aggtgtctgc 
cttccccgtg 
gaccggagcg 
catccctggg 
ccaggtgtct 
accttccccg 
cggaccggag
tacatccctg 
ccccaggtgt 
ccaccttccc 
gccggaccgg 
attacatccc 
tcccccaggt 
tgccaccttc 
gcgccggacc 
taattacatc 
caggaattcg 
tccacacaac 
ctaactcaca 
ccagctgcat 
ttccgcttcc 
agctcactca 
catgtgagca 
tttccatagg 
gcgaaacccg 
ctctcctgtt 
cgtggcgctt 
caagctgggc 
ctatcgtctt 
taacaggatt 
taactacggc 
cttcggaaaa 



agcgggcgcg 
gtgcgtcgcc 
cggctgcctt 
ctctagagcc 
cgtgctggtt 
gggcgaggag 
cggccacaag 
cctgaagttc 
cctgacctac 
cttcaagtcc 
cggcaactac 
cgagctgaag 
caactacaac 
gaacttcaag 
gcagaacacc 
ccagtccgcc 
cgtgaccgcc 
tcaggtgcag 
ataccactga 
gcatctgact 
tttgtgtctc 
gtatttggtt 
tataaagagg 
aagccttgac 
catccctaaa 
tcccagtcat 
catgcctgca 
tcaaagagca 
ggctgtcccc 
cccgggcggc 
ttgggggggg 

gctcaaagag 
cgggctgtcc 
gccccgggcg 
ctttgggggg 
aggctcaaag 
cccgggctgt 
gagccccggg 
ggctttgggg 
gcaggctcaa 
tgcccgggct 
cggagccccg 

ggggctttgg 

ctgcaggctc 
cgtgcccggg 
agcggagccc 

tgggggcttt 

gtctgcaggc 
cccgtgcccg 
ggagcggagc 
cctgggggct 
taatcatggt 
atacgagccg 
ttaattgcgt 
taatgaatcg 
tcgctcactg 
aaggcggtaa 
aaaggccagc 
ctccgccccc 
acaggactat 
ccgaccctgc
tctcatagct 
tgtgtgcacg 
gagtccaacc 
agcagagcga 
tacactagaa 
agagttggta 



ggcgaagcgg 
gcgccgccgt 
cgggggggac 
tctgctaacc 
gttgtgctgt 
ctgttcaccg 
ttcagcgtgt 
atctgcacca 
ggcgtgcagt 
gccatgcccg 
aagacccgcg 
ggcatcgact 
agccacaacg 
atccgccaca 
cccatcggcg 
ctgagcaaag 
gccgggatca 
gctgcctatc 
gatctttttc 
tctggctaat 
tcactcggaa 
tagagtttgg 
tcatcagtat 
ttgaggttag 
attttcctta 
agctgtccct 
ggtcgactct 
gcgagaagcg 
gcacgctgcc 
tcgctgctgc 
gctgtccccg 
cagcgagaag 
ccgcacgctg 
gctcgctgct 
gggctgtccc 
agcagcgaga 
ccccgcacgc 
cggctcgctg 

gggggctgtc 

agagcagcga 
gtccccgcac 
ggcggctcgc 

gggggggctg 

aaagagcagc 
ctgtccccgc 
cgggcggctc
gggggggggc 

tcaaagagca 
ggctgtcccc 
cccgggcggc 
ttgggggggg 
catagctgtt 
gaagcataaa 
tgcgctcact 
gccaacgcgc 
actcgctgcg 
tacggttatc 
aaaaggccag 
ctgacgagca 
aaagatacca 
cgcttaccgg 
cacgctgtag 
aaccccccgt 
cggtaagaca 
ggtatgtagg 
ggacagtatt 
gctcttgatc 



tgcggcgccg 
ccccttctcc 

ggggcagggc 

atgttcatgc 
ctcatcattt 
gggtggtgcc 
ccggcgaggg 
ccggcaagct 
gcttcagccg 
aaggctacgt 
ccgaggtgaa 
tcaaggagga 
tctatatcat 
acatcgagga 
acggccccgt 
accccaacga 
ctctcggcat 
agaaggtggt
cctctgccaa 
aaaggaaatt 
ggacatatgg 
caacatatgc 
atgaaacagc 
atttttttta 
catgttttac 
cttctcttat 
agtggatccc 
ttcagaggaa 
ggctcgggga 
cccctagcgg 
tgagcggatc 
cgttcagagg 
ccggctcggg 
gccccctagc 
cgtgagcgga 
agcgttcaga 
tgccggctcg 
ctgcccccta 
cccgtgagcg 
gaagcgttca 
gctgccggct 
tgctgccccc 
tccccgtgag 
gagaagcgtt 
acgctgccgg 
gctgctgccc 
tgtccccgtg 
gcgagaagcg 
gcacgctgcc 
tcgctgctgc 
gctgtccccg 
tcctgtgtga 
gtgtaaagcc 
gcccgctttc 
ggggagaggc 
ctcggtcgtt 
cacagaatca 
gaaccgtaaa 
tcacaaaaat 
ggcgtttccc 
atacctgtcc 
gtatctcagt 
tcagcccgac 
cgacttatcg 

cggtgctaca

tggtatctgc 
cggcaaacaa 



gcaggaagga 
atctccagcc 

ggggttcggc 

cttcttcttt 
tggcaaagaa 
catcctggtc 
cgagggcgat 
gcccgtgccc 
ctaccccgac 
ccaggagcgc 
gttcgagggc 
cggcaacatc 
ggccgacaag 
cggcagcgtg 
gctgctgccc 
gaagcgcgat 
ggacgagctg 
ggctggtgtg 
aaattatggg 
tattttcatt 
gagggcaaat 
catatgctgg 
cccctgctgt 
tattttgttt 
tagccagatt 
gaagatccct 
ccgccccgta 
agcgatcccg 
tgcgggggga 
gggagggacg 
cgcggccccg 
aaagcgatcc 
gatgcggggg 
gggggaggga 

tccgcggccc 
ggaaagcgat 
gggatgcggg 
gcgggggagg 
gatccgcggc 
gaggaaagcg 

cggggatgcg 
tagcggggga 
cggatccgcg 
cagaggaaag 
ctcggggatg 
cctagcgggg
agcggatccg 
ttcagaggaa 
ggctcgggga 
cccctagcgg 
tgagcggatc 
aattgttatc 
tggggtgcct 
cagtcgggaa 
ggtttgcgta 
cggctgcggc 
ggggataacg 
aaggccgcgt 
cgacgctcaa 
cctggaagct 
gcctttctcc 
tcggtgtagg 
cgctgcgcct 
ccactggcag 
gagttcttga 
gctctgctga 
accaccgctg 



aatgggcggg 
tcggggctgc 
ttctggcgtg 
ttcctacagc 
ttcgccacca 
gagctggacg 
gccacctacg 
tggcccaccc 
cacatgaagc 
accatcttct 
gacaccctgg 
ctggggcaca 
cagaagaacg 
cagctcgccg 
gacaaccact 
cacatggtcc 
tacaagtaag 
gccaatgccc 
gacatcatga 
gcaatagtgt 
catttaaaac 
ctgccatgaa 
ccattcctta 
tgtgttattt 
tttcctcctc 
cgacctgcag 
tcccccaggt 
tgccaccttc 
gcgccggacc 
taattacatc 
tatcccccag 
cgtgccacct 
gagcgccgga 
cgtaattaca 
cgtatccccc 
cccgtgccac 
gggagcgccg 
gacgtaatta 
cccgtatccc 
atcccgtgcc 

gggggagcgc 

gggacgtaat 
gccccgtatc 
cgatcccgtg 
cggggggagc 
gagggacgta 
cggccccgta 
agcgatcccg
tgcgggggga 
gggagggacg 
cgcggggctg 
cgctcacaat 
aatgagtgag 
acctgtcgtg 
ttgggcgctc 
gagcggtatc
caggaaagaa
tgctggcgtt 
gtcagaggtg 
ccctcgtgcg 
cttcgggaag
tcgttcgctc 
tatccggtaa
cagccactgg 
agtggtggcc 
agccagttac
gtagcggtgg 



3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080
7140 
7200 
7260 
7320 






tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 7380
gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt 7440
catgagatta tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa 7500
atcaatctaa agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga 7560
ggcacctatc tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt 7620
gtagataact acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg 7680
agacccacgc tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga 7740
gcgcagaagt ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga 7800
agctagagta agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg 7860
catcgtggtg tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc 7920
aaggcgagtt acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc 7980
gatcgttgtc agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca 8040
taattctctt actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac 8100
caagtcattc tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg 8160
ggataatacc gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc 8220
ggggcgaaaa ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg 8280
tgcacccaac tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac 8340
aggaaggcaa aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat 8400
actcttcctt tttcaatatt attgaagcat ttatcagggt tattgtctca tgagcggata 8460
catatttgaa tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa 8520
agtgccacct gacgtagtta acaaaaaaaa gcccgccgaa gcgggcttta ttaccaagcg 8580
aagcgccatt cgccattcag gctgcgcaac tgttgggaag ggcgatcggt gcgggcctct 8640
tcgctattac gccagctggc gaaaggggga tgtgctgcaa ggcgattaag ttgggtaacg 8700
ccagggtttt cccagtcacg acgttgtaaa acgacggcca gtccgtaata cgactcactt 8760
aaggccttga ctagagggtc gacggtatac agacatgata agatacattg atgagtttgg 8820
acaaaccaca actagaatgc agtgaaaaaa atgctttatt tgtgaaattt gtgatgctat 8880
tgctttattt gtaaccatta taagctgcaa taaacaagtt ggggtgggcg aagaactcca 8940
gcatgagatc cccgcgctgg aggatcatcc agccggcgtc ccggaaaacg attccgaagc 9000
ccaacctttc atagaaggcg gcggtggaat cgaaatctcg tagcacgtgt cagtcctgct 9060
cctcggccac gaagtgcacg 9080

<210> 111
<211> 4223
<212> DNA
<213> Artificial Sequence



<220> 

<223> pLIT38attBBSRpolyA10 Plasmid



<400> 111 

gttaactacg 

tttctaaata 

ataatattga 

ttttgcggca 

tgctgaagat 

gatccttgag 

gctatgtggc 

acactattct 

tggcatgaca 

caacttactt 

gggggatcat 

cgacgagcgt 

tggcgaacta 

agttgcagga 

tggagccggt 

ctcccgtatc 

acagatcgct 

ctcatatata 

aagattgtat 

aatttttgtt 

aaatcaaaag 

ctattaaaga 

ccactacgtg 

aatcggaacc 

gaaaggaagg 

cgctgcgcgt 

atctaggtga 



tcaggtggca 
cattcaaata 
aaaaggaaga 
ttttgccttc 
cagttgggtg 
agttttcgcc 
gcggtattat 
cagaatgact 
gtaagagaat 
ctgacaacga 
gtaactcgcc 
gacaccacga 
cttactctag 
ccacttctgc 
gagcgtgggt 
gtagttatct 
gagataggtg 
ctttagattg 
aagcaaatat
aaatcagctc 
aatagcccga 
acgtggactc 
aaccatcacc 
ctaaagggag
gaagaaagcg 
aaccaccaca 
agatcctttt 



cttttcgggg 
tgtatccgct 
gtatgagtat 
ctgtttttgc 
cacgagtggg 
ccgaagaacg 
cccgtgttga 
tggttgagta 
tatgcagtgc 
tcggaggacc
ttgatcgttg 
tgcctgtagc 
cttcccggca 
gctcggccct 
ctcgcggtat 
acacgacggg 
cctcactgat 
atttaccccg 
ttaaattgta 
attttttaac 
gatagggttg 
caacgtcaaa 
caaatcaagt 
cccccgattt 
aaaggagcgg 
cccgccgcgc 
tgataatctc 



aaatgtgcgc 
catgagacaa 
tcaacatttc 
tcacccagaa 
ttacatcgaa 
ttctccaatg 
cgccgggcaa 
ctcaccagtc 
tgccataacc 
gaaggagcta 
ggaaccggag 
aatggcaaca 
acaattaata 
tccggctggc 
cattgcagca 
gagtcaggca 
taagcattgg 
gttgataatc 
aacgttaata 
caataggccg 
agtgttgttc 
gggcgaaaaa 
tttttggggt 
agagcttgac 
gcgctagggc 
ttaatgcgcc 
atgaccaaaa 



ggaaccccta 
taaccctgat 
cgtgtcgccc 
acgctggtga 
ctggatctca 
atgagcactt 
gagcaactcg 
acagaaaagc 
atgagtgata 
accgcttttt 
ctgaatgaag 
acgttgcgca 
gactggatgg 
tggtttattg 
ctggggccag 
actatggatg 
taactgtcag 
agaaaagccc 
ttttgttaaa 
aaatcggcaa 
cagtttggaa 
ccgtctatca 
cgaggtgccg 
ggggaaagcg 
gctggcaagt 
gctacagggc 
tcccttaacg 



tttgtttatt 
aaatgcttca 
ttattccctt 
aagtaaaaga 
acagcggtaa 
ttaaagttct 
gtcgccgcat 
atcttacgga 
acactgcggc 
tgcacaacat 
ccataccaaa 
aactattaac 
aggcggataa 
ctgataaatc 
atggtaagcc 
aacgaaatag 
accaagttta 
caaaaacagg 
attcgcgtta 
aatcccttat 
caagagtcca 

gggcgatggc 

taaagcacta 
aacgtggcga 
gtagcggtca 
gcgtaaaagg 
tgagttttcg 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200

1260 

1320 

1380 

1440 

1500 

1560 

1620 
ttccactgag
ctgcgcgtaa 
ccggatcaag 
ccaaatactg 
ccgcctacat 
tcgtgtctta 
tgaacggggg 
tacctacagc 
tatccggtaa 
gcctggtatc 
tgatgctcgt 
ttcctggcct 
accccaggct 
acaatttcac 
ctagtggggc 
tgctttttta 
aacaagatct 
ataaacatca 
atattgaagc 
cagtttcgaa 
acgaagtaga 
cagactatgc 
cgattgaaga 
ttggctgctg 
gcatccagga 
gaggcacgat 
cataattgga 
gtgtataatg 
gaactgatga 
aagaaatgcc 
aaaagaagag 
gtcatgctgt 
aagctgcact 
ataacagtta 
ctattaataa 
ataaggaata 
gtagaggttt 
atgaatgcaa 
aatagcatca 
gtctccggat 
cggactggcc 
tcgccttgca 
tcgcccttcc 
cccgcttcgg 



cgtcagaccc 
tctgctgctt 
agctaccaac 
ttcttctagt 
acctcgctct 
ccgggttgga 
gttcgtgcac 
gtgagctatg 
gcggcagggt 
tttatagtcc 
caggggggcg 
tttgctggcc 
ttacacttta 
acaggaaaca 
ccgtgcaatt 
tactaacttg 
agaattagta 
tgtgggagcg 
gtatatagga 
tggacaaaag 
tagaagtatt 
accagattgt 
actcattcca 
cctgaggctg 
aaccagcagc 
ggccgctttg 
caaactacct 
tgttaaacta 
atgggagcag 
atctagtgat 
aaaggtagaa 
gtttagtaat 
gctatacaag 
taatcataac 
ctatgctcaa 
tttgatgtat 
tacttgcttt 
ttgttgttgt 
caaatttcac 
gtacaggcat 
gtcgttttac 
gcacatcccc 
caacagttgc
cgggcttttt 



cgtagaaaag 
gcaaacaaaa 
tctttttccg 
gtagccgtag 
gctaatcctg 
ctcaagacga 
acagcccagc 
agaaagcgcc 
cggaacagga 
tgtcgggttt 
gagcctatgg 
ttttgctcac 
tgcttccggc 
gctatgacca
gaagccggct 
agcgaaatct 
gaagtagcga 
gcaattcgta 
cgagtaactg
gattttgaca 
cgagtggtaa 
tttgtgttaa
ctcaaatata 
gacgacctcg 
ggctatccgc 
gtccggatct 
acagagattt 
ctgattctaa 
tggtggaatg 
gatgaggcta 
gaccccaagg
agaactcttg 
aaaattatgg 
atactgtttt 
aaattgtgta 
agtgccttga 
aaaaaacctc 
taacttgttt 
aaataaagat 
gcgtcgaccc 
aacgtcgtga 
ctttcgccag 
gcagcctgaa 
ttt 



atcaaaggat 
aaaccaccgc 
aaggtaactg 
ttaggccacc 
ttaccagtgg 
tagttaccgg 
ttggagcgaa 
acgcttcccg 
gagcgcacga 
cgccacctct 
aaaaacgcca 
atgtaatgtg 
tcgtatgttg 
tgattacgcc 
ggcgccaagc 
ggatcaccat 
cagagaagat 
cgaaaacagg 
tttgtgcaga 
cgattgtagc 
gtccttgtgg 
tagaaatgaa 
cccgaaatta 
cggagttcta 
gcatccatgc
ttgtgaagga 
aaagctctaa 
ttgtttgtgt 
cctttaatga 
ctgctgactc 
actttccttc 
cttgctttgc 
aaaaatattc 
ttcttactcc 
cctttagctt 
ctagagatca 
ccacacctcc 
attgcagctt 
ccacgaattc 
tctagtcaag 
ctgggaaaac 
ctggcgtaat 
tggcgaatgg 



cttcttgaga 
taccagcggt 
gcttcagcag 
acttcaagaa 
ctgctgccag 
ataaggcgca 
cgacctacac 
aagggagaaa 

gggagcttcc 
gacttgagcg 
gcaacgcggc 
agttagctca 
tgtggaattg 
aagctacgta 
ttctctgcag 
gaaaacattt 
tacaatgctt 
agaaatcatt 
agccattgcg
tgttagacac 
tatgtgtagg 
tggcaagtta 
aaagttttac 
ccggcagtgc 
ccccgaactg 
accttacttc 
ggtaaatata 
attttagatt 
ggaaaacctg 
tcaacattct 
agaattgcta 
tatttacacc 
tgtaaccttt 
acacaggcat 
tttaatttgt 
taatcagcca 
ccctgaacct 
ataatggtta 
gctagcttcg 
gccttaagtg 
cctggcgtta 
agcgaagagg 
cgcttcgctt 



tccttttttt 
ggtttgtttg 
agcgcagata 
ctctgtagca 
tggcgataag 
gcggtcgggc 
cgaactgaga 
ggcggacagg 
agggggaaac 
tcgatttttg 
ctttttacgg 
ctcattaggc 
tgagcggata 
atacgactca 
gattgaagcc 
aacatttctc 
tatgaggata 
tcggcagtac 
attggtagtg 
ccttattctg 
gagttgattt
gtcaaaacta 
cataccaagc 
aaatccgtcg
caggagtggg 
tgtggtgtga 
aaatttttaa 
ccaacctatg 
ttttgctcag 
actcctccaa 
agttttttga 
acaaaggaaa 
ataagtaggc 
agagtgtctg 
aaaggggtta
taccacattt 
gaaacataaa 
caaataaagc 
gccgtgacgc 
agtcgtatta
cccaacttaa 
cccgcaccga 
ggtaataaag 



1680
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4223 



<210> 112
<211> 5855 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCX-LamlntR Plasmid 



<400> 112

gtcgacattg 

gcccatatat

ccaacgaccc 

ggactttcca 

atcaagtgta 

cctggcatta 

tattagtcat 

atctcccccc 

gcgatggggg 

gggcggggcg 

tccttttatg 

gggagtcgct

ccggctctga 



attattgact 
ggagttccgc 
ccgcccattg 
ttgacgtcaa 
tcatatgcca
tgcccagtac 
cgctattacc 
cctccccacc 

cggggggggg
aggcggagag
gcgaggcggc
gcgttgcctt 
ctgaccgcgt 



agttattaat 
gttacataac 
acgtcaataa
tgggtggact 
agtacgcccc 
atgaccttat 
atgggtcgag 
cccaattttg 

gggggcgcgc

gtgcggcggc
ggcggcggcg 
cgccccgtgc 
tactcccaca 



agtaatcaat 
ttacggtaaa 
tgacgtatgt 
atttacggta
ctattgacgt
gggactttcc 
gtgagcccca 
tatttattta 
gccaggcggg
agccaatcag
gccctataaa
cccgctccgc 
ggtgagcggg 



tacggggtca
tggcccgcct 
tcccatagta 
aactgcccac 
caatgacggt
tacttggcag 
cgttctgctt 
ttttttaatt 
gcggggcggg 
agcggcgcgc 
aagcgaagcg 
gccgcctcgc 
cgggacggcc 



ttagttcata 
ggctgaccgc 
acgccaatag
ttggcagtac 
aaatggcccg 
tacatctacg 
cactctcccc 
attttgtgca 
gcgaggggcg 
tccgaaagtt 
cgcggcgggc 
gccgcccgcc 
cttctcctcc 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600

660 

720 

780 




gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840
ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900
tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960
cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020
ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140
cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200
gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260
ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440
tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560
acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620
gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680
acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcatggga agaaggcgaa 1740
gtcatgagcg ccgggattta ccccctaacc tttatataag aaacaatgga tattactgct 1800
acagggaccc aaggacgggt aaagagtttg gattaggcag agacaggcga atcgcaatca 1860
ctgaagctat acaggccaac attgagttat tttcaggaca caaacacaag cctctgacag 1920
cgagaatcaa cagtgataat tccgttacgt tacattcatg gcttgatcgc tacgaaaaaa 1980
tcctggccag cagaggaatc aagcagaaga cactcataaa ttacatgagc aaaattaaag 2040
caataaggag gggtctgcct gatgctccac ttgaagacat caccacaaaa gaaattgcgg 2100
caatgctcaa tggatacata gacgagggca aggcggcgtc agccaagtta atcagatcaa 2160
cactgagcga tgcattccga gaggcaatag ctgaaggcca tataacaaca aaccatgtcg 2220
ctgccactcg cgcagcaaaa tctagagtaa ggagatcaag acttacggct gacgaatacc 2280
tgaaaattta tcaagcagca gaatcatcac catgttggct cagacttgca atggaactgg 2340
ctgttgttac cgggcaacga gttggtgatt tatgcgaaat gaagtggtct gatatcgtag 2400
atggatatct ttatgtcgag caaagcaaaa caggcgtaaa aattgccatc ccaacagcat 2460
tgcatattga tgctctcgga atatcaatga aggaaacact tgataaatgc aaagagattc 2520
ttggcggaga aaccataatt gcatctactc gtcgcgaacc gctttcatcc ggcacagtat 2580
caaggtattt tatgcgcgca cgaaaagcat caggtctttc cttcgaaggg gatccgccta 2640
cctttcacga gttgcgcagt ttgtctgcaa gactctatga gaagcagata agcgataagt 2700
ttgctcaaca tcttctcggg cataagtcgg acaccatggc atcacagtat cgtgatgaca 2760
gaggcaggga gtgggacaaa attgaaatca aataagaatt cactcctcag gtgcaggctg 2820
cctatcagaa ggtggtggct ggtgtggcca atgccctggc tcacaaatac cactgagatc 2880
tttttccctc tgccaaaaat tatggggaca tcatgaagcc ccttgagcat ctgacttctg 2940
gctaataaag gaaatttatt ttcattgcaa tagtgtgttg gaattttttg tgtctctcac 3000
tcggaaggac atatgggagg gcaaatcatt taaaacatca gaatgagtat ttggtttaga 3060
gtttggcaac atatgccata tgctggctgc catgaacaaa ggtggctata aagaggtcat 3120
cagtatatga aacagccccc tgctgtccat tccttattcc atagaaaagc cttgacttga 3180
ggttagattt tttttatatt ttgttttgtg ttattttttt ctttaacatc cctaaaattt 3240
tccttacatg ttttactagc cagatttttc ctcctctcct gactactccc agtcatagct 3300
gtccctcttc tcttatgaag atccctcgac ctgcagccca agcttggcgt aatcatggtc 3360
atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 3420
aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 3480
gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagcggatcc gcatctcaat 3540
tagtcagcaa ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt 3600
tccgcccatt ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc 3660
gcctcggcct ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt 3720
tgcaaaaagc taacttgttt attgcagctt ataatggtta caaataaagc aatagcatca 3780
caaatttcac aaataaagca tttttttcac tgcattctag ttgtggtttg tccaaactca 3840
tcaatgtatc ttatcatgtc tggatccgct gcattaatga atcggccaac gcgcggggag 3900
aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 3960
cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 4020
atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 4080
taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 4140
aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 4200
tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 4260
gtccgccttt ctcccttcgg gaagcgtggc gctttctcaa tgctcacgct gtaggtatct 4320
cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 4380
cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 4440
atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 4500
tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat 4560
ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 4620
acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 4680
aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 4740
aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct 4800
tttaaattaa
cagttaccaa 
catagttgcc 
ccccagtgct 
aaaccagcca 
ccagtctatt 
caacgttgtt 
attcagctcc 
agcggttagc 
actcatggtt 
ttctgtgact 
ttgctcttgc 
gctcatcatt 
atccagttcg 
cagcgtttct 
gacacggaaa 
gggttattgt 
ggttccgcgc 



aaatgaagtt
tgcttaatca 
tgactccccg
gcaatgatac 
gccggaaggg 
aattgttgcc
gccattgcta 
ggttcccaac 
tccttcggtc 
atggcagcac 
ggtgagtact 
ccggcgtcaa 
ggaaaacgtt 
atgtaaccca 

gggtgagcaa 
tgttgaatac 
ctcatgagcg 
acatttcccc 



ttaaatcaat 
gtgaggcacc 
tcgtgtagat 
cgcgagaccc 
ccgagcgcag 
gggaagctag 
caggcatcgt 
gatcaaggcg 
ctccgatcgt 
tgcataattc 
caaccaagtc 
tacgggataa 
cttcggggcg 
ctcgtgcacc 
aaacaggaag 
tcatactctt 
gatacatatt 
gaaaagtgcc 



ctaaagtata 
tatctcagcg 
aactacgata 
acgctcaccg 
aagtggtcct 
agtaagtagt 
ggtgtcacgc 
agttacatga 
tgtcagaagt 
tcttactgtc 
attctgagaa 
taccgcgcca 
aaaactctca 
caactgatct 
gcaaaatgcc 
cctttttcaa 
tgaatgtatt 
acctg 



tatgagtaaa 
atctgtctat 
cgggagggct 
gctccagatt 
gcaactttat 
tcgccagtta 
tcgtcgtttg 
tcccccatgt 
aagttggccg 
atgccatccg 
tagtgtatgc 
catagcagaa 
aggatcttac 
tcagcatctt
gcaaaaaagg 
tattattgaa 
tagaaaaata 



cttggtctga 
ttcgttcatc 
taccatctgg 
tatcagcaat 
ccgcctccat 
atagtttgcg 
gtatggcttc 
tgtgcaaaaa 
cagtgttatc 
taagatgctt 
ggcgaccgag 
ctttaaaagt 
cgctgttgag
ttactttcac 
gaataagggc
gcatttatca 
aacaaatagg 



4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5855 



<210> 113 
<211> 4346 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pSV40-193AttpsensePur Plasmid



<400> 113 

ccggtgccgc 

atgaccgagt 

cgcaccctcg 

cgccacatcg 

atcggcaagg 

agcgtcgaag 

tcccggctgg 

cccgcgtggt 

agcgccgtcg 

gagacctccg 

gacgtcgagg 

cgcccgcccc 

cgaagccgac

gaggatcata 

acacctcccc 

tgcagcttat 

tttttcactg 

gatccgcgcc 

gcttggcgta 

cacacaacat 

aactcacatt 

agctgcatta 

ccgcttcctc 

ctcactcaaa 

tgtgagcaaa 

tccataggct 

gaaacccgac 

ctcctgttcc 

tggcgctttc 

agctgggctg 

atcgtcttga 

acaggattag 

actacggcta 

tcggaaaaag

tttttgtttg 

tcttttctac 

tgagattatc 

caatctaaag 

cacctatctc 



caccatcccc 
acaagcccac
ccgccgcgtt 
agcgggtcac 
tgtgggtcgc 
cgggggcggt 
ccgcgcagca 
tcctggccac 
tgctccccgg 
cgccccgcaa 
tgcccgaagg 
acgacccgca 
ccgggcggcc 
atcagccata 
ctgaacctga 
aatggttaca 
cattctagtt 
ggatccttaa 
atcatggtca 
acgagccgga 
aattgcgttg
atgaatcggc
gctcactgac 
ggcggtaata 
aggccagcaa 
ccgcccccct 
aggactataa 
gaccctgccg 
tcatagctca 
tgtgcacgaa 
gtccaacccg 
cagagcgagg 
cactagaagg 
agttggtagc 
caagcagcag 
ggggtctgac 
aaaaaggatc 
tatatatgag 
agcgatctgt 



tgacccacgc 
ggtgcgcctc 
cgccgactac 
cgagctgcaa 
ggacgacggc 
gttcgccgag 
acagatggaa 
cgtcggcgtc 
agtggaggcg 
cctccccttc 
accgcgcacc 
gcgcccgacc 
ccgccgaccc 
ccacatttgt 
aacataaaat 
aataaagcaa 
gtggtttgtc 
ttaagtctag 
tagctgtttc 
agcataaagt 
cgctcactgc 
caacgcgcgg 
tcgctgcgct 
cggttatcca 
aaggccagga 
gacgagcatc 
agataccagg 
cttaccggat 
cgctgtaggt 
ccccccgttc 
gtaagacacg 
tatgtaggcg 
acagtatttg 
tcttgatccg 
attacgcgca 
gctcagtgga 
ttcacctaga 
taaacttggt 
ctatttcgtt 



ccctgacccc 
gccacccgcg 
cccgccacgc 
gaactcttcc 
gccgcggtgg 
atcggcccgc 
ggcctcctgg 
tcgcccgacc 
gccgagcgcg 
tacgagcggc 
tggtgcatga 
gaaaggagcg 
cgcacccgcc 
agaggtttta 
gaatgcaatt 
tagcatcaca 
caaactcatc 
agtcgactgt 
ctgtgtgaaa 
gtaaagcctg 
ccgctttcca 
ggagaggcgg 
cggtcgttcg 
cagaatcagg 
accgtaaaaa 
acaaaaatcg 
cgtttccccc 
acctgtccgc 
atctcagttc 
agcccgaccg 
acttatcgcc 
gtgctacaga 
gtatctgcgc 
gcaaacaaac 
gaaaaaaagg 
acgaaaactc 
tccttttaaa 
ctgacagtta 
catccatagt 



tcacaaggag 
acgacgtccc 
gccacaccgt 
tcacgcgcgt 
cggtctggac 
gcatggccga 
cgccgcaccg 
accagggcaa 
ccggggtgcc 
tcggcttcac 
cccgcaagcc 
cacgacccca 
cccgaggccc 
cttgctttaa 
gttgttgtta 
aatttcacaa 
aatgtatctt 
ttaaacctgc 
ttgttatccg 

gggtgcctaa 

gtcgggaaac 
tttgcgtatt 
gctgcggcga 
ggataacgca 
ggccgcgttg 
acgctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
actggcagca
gttcttgaag 
tctgctgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
ttaaaaatga 
ccaatgctta 
tgcctgactc 



acgaccttcc 
ccgggccgta 
cgacccggac 
cgggctcgac 
cacgccggag 
gttgagcggt 
gcccaaggag 
gggtctgggc 
cgccttcctg 
cgtcaccgcc 
cggtgcctga 
tggctccgac 
accgactcta 
aaaacctccc 
acttgtttat 
ataaagcatt 
atcatgtctg 
aggcatgcaa 
ctcacaattc 
tgagtgagct 
ctgtcgtgcc 
gggcgctctt 
gcggtatcag 
ggaaagaaca 
ctggcgtttt 
cagaggtggc 
ctcgtgcgct 
tcgggaagcg 
gttcgctcca 
tccggtaact 
gccactggta 
tggtggccta 
ccagttacct 
agcggtggtt 
gatcctttga 
attttggtca 
agttttaaat 
atcagtgagg 
cccgtcgtgt 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 
agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag 2400
acccacgctc accggctcca gatttatcag caataaacca gccagccgga agggccgagc 2460
gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag 2520
ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca 2580
tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa 2640
ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga 2700
tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata 2760
attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag tactcaacca 2820
agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg 2880
ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa cgttcttcgg 2940
ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg 3000
cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag 3060
gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac 3120
tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg agcggataca 3180
tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt ccccgaaaag 3240
tgccacctga cgtctaagaa accattatta tcatgacatt aacctataaa aataggcgta 3300
tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc 3360
agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga caagcccgtc 3420
agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg gcatcagagc 3480
agattgtact gagagtgcac catatgcggt gtgaaatacc gcacagatgc gtaaggagaa 3540
aataccgcat caggcgccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg 3600
tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg atgtgctgca aggcgattaa 3660
gttgggtaac gccagggttt tcccagtcac gacgttgtaa aacgacggcc agtgaattcg 3720
agctgtggaa tgtgtgtcag ttagggtgtg gaaagtcccc aggctcccca gcaggcagaa 3780
gtatgcaaag catgcatctc aattagtcag caaccaggtg tggaaagtcc ccaggctccc 3840
cagcaggcag aagtatgcaa agcatgcatc tcaattagtc agcaaccata gtcccgcccc 3900
taactccgcc catcccgccc ctaactccgc ccagttccgc ccattctccg ccccatggct 3960
gactaatttt ttttatttat gcagaggccg aggccgcctc ggcctctgag ctattccaga 4020
agtagtgagg aggctttttt ggaggctcgg tacccccttg cgctaatgct ctgttacagg 4080
tcactaatac catctaagta gttgattcat agtgactgca tatgttgtgt tttacagtat 4140
tatgtagtct gttttttatg caaaatctaa tttaatatat tgatatttat atcattttac 4200
gtttctcgtt cagctttttc atactaagtt ggcattataa aaaagcattg cttatcaatt 4260
tgttgcaacg aacaggtcac tatcagtcaa aataaaatca ttatttgatt tcaattttgt 4320
cccactccct gcctctgggg ggcgcg 4346

<210> 114
<211> 3166
<212> DNA
<213> Artificial Sequence



<220> 

<223> p18attBZeo Plasmid
<400> 114 

cagttgccgg ccgggtcgcg cagggcgaac tcccgccccc acggctgctc gccgatctcg 60
gtcatggccg gcccggaggc gtcccggaag ttcgtggaca cgacctccga ccactcggcg 120
tacagctcgt ccaggccgcg cacccacacc caggccaggg tgttgtccgg caccacctgg 180
tcctggaccg cgctgatgaa cagggtcacg tcgtcccgga ccacaccggc gaagtcgtcc 240
tccacgaagt cccgggagaa cccgagccgg tcggtccaga actcgaccgc tccggcgacg 300
tcgcgcgcgg tgagcaccgg aacggcactg gtcaacttgg ccatggatcc agatttcgct 360
caagttagta taaaaaagca ggcttcaatc ctgcagagaa gcttgcatgc ctgcaggtcg 420
actctagagg atccccgggt accgagctcg aattcgtaat catggtcata gctgtttcct 480
gtgtgaaatt gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt 540
aaagcctggg gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc 600
gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg 660
agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 720
gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca 780
gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 840
cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac 900
aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg 960
tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac 1020
ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat 1080
ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 1140
cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac 1200
ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt 1260
gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt 1320
atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc 1380






aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga 1440
aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 1500
gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc 1560
cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct 1620
gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca 1680
tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct 1740
ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca 1800
ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc 1860
atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg 1920
cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct 1980
tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa 2040
aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta 2100
tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc 2160
ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg 2220
agttgctctt gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa 2280
gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg 2340
agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc 2400
accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg 2460
gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat 2520
cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 2580
ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg tagttaacaa aaaaaagccc 2640
gccgaagcgg gctttattac caagcgaagc gccattcgcc attcaggctg cgcaactgtt 2700
gggaagggcg atcggtgcgg gcctcttcgc tattacgcca gctggcgaaa gggggatgtg 2760
ctgcaaggcg attaagttgg gtaacgccag ggttttccca gtcacgacgt tgtaaaacga 2820
cggccagtcc gtaatacgac tcacttaagg ccttgactag agggtcgacg gtatacagac 2880
atgataagat acattgatga gtttggacaa accacaacta gaatgcagtg aaaaaaatgc 2940
tttatttgtg aaatttgtga tgctattgct ttatttgtaa ccattataag ctgcaataaa 3000
caagttgggg tgggcgaaga actccagcat gagatccccg cgctggagga tcatccagcc 3060
ggcgtcccgg aaaacgattc cgaagcccaa cctttcatag aaggcggcgg tggaatcgaa 3120
atctcgtagc acgtgtcagt cctgctcctc ggccacgaag tgcacg 3166

<210> 115
<211> 7600
<212> DNA
<213> Artificial Sequence



<220> 

<223> p18attBZeo3'6XHS4eGFP Plasmid



<400> 115 

cagttgccgg ccgggtcgcg cagggcgaac tcccgccccc acggctgctc gccgatctcg 60
gtcatggccg gcccggaggc gtcccggaag ttcgtggaca cgacctccga ccactcggcg 120
tacagctcgt ccaggccgcg cacccacacc caggccaggg tgttgtccgg caccacctgg 180
tcctggaccg cgctgatgaa cagggtcacg tcgtcccgga ccacaccggc gaagtcgtcc 240
tccacgaagt cccgggagaa cccgagccgg tcggtccaga actcgaccgc tccggcgacg 300
tcgcgcgcgg tgagcaccgg aacggcactg gtcaacttgg ccatggatcc agatttcgct 360
caagttagta taaaaaagca ggcttcaatc ctgcagagaa gcttgatcta gttattaata 420
gtaatcaatt acggggtcat tagttcatag cccatatatg gagttccgcg ttacataact 480
tacggtaaat ggcccgcctg gctgaccgcc caacgacccc cgcccattga cgtcaataat 540
gacgtatgtt cccatagtaa cgccaatagg gactttccat tgacgtcaat gggtggacta 600
tttacggtaa actgcccact tggcagtaca tcaagtgtat catatgccaa gtacgccccc 660
tattgacgtc aatgacggta aatggcccgc ctggcattat gcccagtaca tgaccttatg 720
ggactttcct acttggcagt acatctacgt attagtcatc gctattacca tgggtcgagg 780
tgagccccac gttctgcttc actctcccca tctccccccc ctccccaccc ccaattttgt 840
atttatttat tttttaatta ttttgtgcag cgatgggggc gggggggggg ggggcgcgcg 900
ccaggcgggg cggggcgggg cgaggggcgg ggcggggcga ggcggagagg tgcggcggca 960
gccaatcaga gcggcgcgct ccgaaagttt ccttttatgg cgaggcggcg gcggcggcgg 1020
ccctataaaa agcgaagcgc gcggcgggcg ggagtcgctg cgttgccttc gccccgtgcc 1080
ccgctccgcg ccgcctcgcg ccgcccgccc cggctctgac tgaccgcgtt actcccacag 1140
gtgagcgggc gggacggccc ttctcctccg ggctgtaatt agcgcttggt ttaatgacgg 1200
ctcgtttctt ttctgtggct gcgtgaaagc cttaaagggc tccgggaggg ccctttgtgc 1260
gggggggagc ggctcggggg gtgcgtgcgt gtgtgtgtgc gtggggagcg ccgcgtgcgg 1320
cccgcgctgc ccggcggctg tgagcgctgc gggcgcggcg cggggctttg tgcgctccgc 1380
gtgtgcgcga ggggagcgcg gccgggggcg gtgccccgcg gtgcgggggg gctgcgaggg 1440
gaacaaaggc tgcgtgcggg gtgtgtgcgt gggggggtga gcagggggtg tgggcgcggc 1500
ggtcgggctg taaccccccc ctgcaccccc ctccccgagt tgctgagcac ggcccggctt 1560
cgggtgcggg gctccgtgcg gggcgtggcg cggggctcgc cgtgccgggc ggggggtggc 1620




ggcaggtggg ggtgccgggc ggggcggggc 
ggcgcggcgg ccccggagcg ccggcggctg 
tttatggtaa tcgtgcgaga gggcgcaggg 
aaatctggga ggcgccgccg caccccctct 
gcaggaagga aatgggcggg gagggccttc 
atctccagcc tcggggctgc cgcaggggga 
ggggttcggc ttctggcgtg tgaccggcgg 
cttcttcttt ttcctacagc tcctgggcaa 
tggcaaagaa ttcgccacca tggtgagcaa 
catcctggtc gagctggacg gcgacgtaaa 
cgagggcgat gccacctacg gcaagctgac 
gcccgtgccc tggcccaccc tcgtgaccac 
ctaccccgac cacatgaagc agcacgactt 
ccaggagcgc accatcttct tcaaggacga 
gttcgagggc gacaccctgg tgaaccgcat 
cggcaacatc ctggggcaca agctggagta 
ggccgacaag cagaagaacg gcatcaaggt 
cggcagcgtg cagctcgccg accactacca 
gctgctgccc gacaaccact acctgagcac 
gaagcgcgat cacatggtcc tgctggagtt 
ggacgagctg tacaagtaag aattcactcc 
ggctggtgtg gccaatgccc tggctcacaa 
aaattatggg gacatcatga agccccttga 
tattttcatt gcaatagtgt gttggaattt 
gagggcaaat catttaaaac atcagaatga 
catatgctgg ctgccatgaa caaaggtggc 
cccctgctgt ccattcctta ttccatagaa 
tattttgttt tgtgttattt ttttctttaa 
tagccagatt tttcctcctc tcctgactac 
gaagatccct cgacctgcag cccaagcttg 
ccgccccgta tcccccaggt gtctgcaggc 
agcgatcccg tgccaccttc cccgtgcccg 
tgcgggggga gcgccggacc ggagcggagc 
gggagggacg taattacatc cctgggggct 
cgcggccccg tatcccccag gtgtctgcag 
aaagcgatcc cgtgccacct tccccgtgcc 
gatgcggggg gagcgccgga ccggagcgga 
gggggaggga cgtaattaca tccctggggg 
tccgcggccc cgtatccccc aggtgtctgc 
ggaaagcgat cccgtgccac cttccccgtg 
gggatgcggg gggagcgccg gaccggagcg 
gcgggggagg gacgtaatta catccctggg 
gatccgcggc cccgtatccc ccaggtgtct 
gaggaaagcg atcccgtgcc accttccccg 
cggggatgcg gggggagcgc cggaccggag 
tagcggggga gggacgtaat tacatccctg 
cggatccgcg gccccgtatc ccccaggtgt 
cagaggaaag cgatcccgtg ccaccttccc 
ctcggggatg cggggggagc gccggaccgg 
cctagcgggg gagggacgta attacatccc 
agcggatccg cggccccgta tcccccaggt 
ttcagaggaa agcgatcccg tgccaccttc 
ggctcgggga tgcgggggga gcgccggacc 
cccctagcgg gggagggacg taattacatc 
tgagcggatc cgcggggctg caggaattcg 
aattgttatc cgctcacaat tccacacaac 
tggggtgcct aatgagtgag ctaactcaca 
cagtcgggaa acctgtcgtg ccagctgcat 
ggtttgcgta ttgggcgctc ttccgcttcc 
cggctgcggc gagcggtatc agctcactca 
ggggataacg caggaaagaa catgtgagca 
aaggccgcgt tgctggcgtt tttccatagg 
cgacgctcaa gtcagaggtg gcgaaacccg 
cctggaagct ccctcgtgcg ctctcctgtt 
gcctttctcc cttcgggaag cgtggcgctt 
tcggtgtagg tcgttcgctc caagctgggc 
cgctgcgcct tatccggtaa ctatcgtctt 





cgcctcgggc cggggagggc tcgggggagg 1680
tcgaggcgcg gcgagccgca gccattgcct 1740
acttcctttg tcccaaatct ggcggagccg 1800
agcgggcgcg ggcgaagcgg tgcggcgccg 1860
gtgcgtcgcc gcgccgccgt ccccttctcc 1920
cggctgcctt cgggggggac ggggcagggc 1980
ctctagagcc tctgctaacc atgttcatgc 2040
cgtgctggtt gttgtgctgt ctcatcattt 2100
gggcgaggag ctgttcaccg gggtggtgcc 2160
cggccacaag ttcagcgtgt ccggcgaggg 2220
cctgaagttc atctgcacca ccggcaagct 2280 
cctgacctac ggcgtgcagt gcttcagccg 2340 
cttcaagtcc gccatgcccg aaggctacgt 2400 
cggcaactac aagacccgcg ccgaggtgaa 2460 
cgagctgaag ggcatcgact tcaaggagga 2520 
caactacaac agccacaacg tctatatcat 2580 
gaacttcaag atccgccaca acatcgagga 2640 
gcagaacacc cccatcggcg acggccccgt 2700 
ccagtccgcc ctgagcaaag accccaacga 2760
cgtgaccgcc gccgggatca ctctcggcat 2820
tcaggtgcag gctgcctatc agaaggtggt 2880
ataccactga gatctttttc cctctgccaa 2940
gcatctgact tctggctaat aaaggaaatt 3000
tttgtgtctc tcactcggaa ggacatatgg 3060
gtatttggtt tagagtttgg caacatatgc 3120
tataaagagg tcatcagtat atgaaacagc 3180
aagccttgac ttgaggttag atttttttta 3240
catccctaaa attttcctta catgttttac 3300 
tcccagtcat agctgtccct cttctcttat 3360 
catgcctgca ggtcgactct agtggatccc 3420 
tcaaagagca gcgagaagcg ttcagaggaa 3480 
ggctgtcccc gcacgctgcc ggctcgggga 3540 
cccgggcggc tcgctgctgc cccctagcgg 3600
ttgggggggg gctgtccccg tgagcggatc 3660 
gctcaaagag cagcgagaag cgttcagagg 3720 
cgggctgtcc ccgcacgctg ccggctcggg 3780 
gccccgggcg gctcgctgct gccccctagc 3840
ctttgggggg gggctgtccc cgtgagcgga 3900 
aggctcaaag agcagcgaga agcgttcaga 3960
cccgggctgt ccccgcacgc tgccggctcg 4020 
gagccccggg cggctcgctg ctgcccccta 4080
ggctttgggg gggggctgtc cccgtgagcg 4140 
gcaggctcaa agagcagcga gaagcgttca 4200 
tgcccgggct gtccccgcac gctgccggct 4260 
cggagccccg ggcggctcgc tgctgccccc 4320 
ggggctttgg gggggggctg tccccgtgag 4380
ctgcaggctc aaagagcagc gagaagcgtt 4440
cgtgcccggg ctgtccccgc acgctgccgg 4500
agcggagccc cgggcggctc gctgctgccc 4560 
tgggggcttt gggggggggc tgtccccgtg 4620 
gtctgcaggc tcaaagagca gcgagaagcg 4680 
cccgtgcccg ggctgtcccc gcacgctgcc 4740 
ggagcggagc cccgggcggc tcgctgctgc 4800 
cctgggggct ttgggggggg gctgtccccg 4860 
taatcatggt catagctgtt tcctgtgtga 4920 
atacgagccg gaagcataaa gtgtaaagcc 4980 
ttaattgcgt tgcgctcact gcccgctttc 5040 
taatgaatcg gccaacgcgc ggggagaggc 5100 
tcgctcactg actcgctgcg ctcggtcgtt 5160 
aaggcggtaa tacggttatc cacagaatca 5220 
aaaggccagc aaaaggccag gaaccgtaaa 5280 
ctccgccccc ctgacgagca tcacaaaaat 5340 
acaggactat aaagatacca ggcgtttccc 5400 
ccgaccctgc cgcttaccgg atacctgtcc 5460 
tctcatagct cacgctgtag gtatctcagt 5520 
tgtgtgcacg aaccccccgt tcagcccgac 5580
gagtccaacc cggtaagaca cgacttatcg 5640 






ccactggcag cagccactgg taacaggatt
gagttcttga agtggtggcc taactacggc 
gctctgctga agccagttac cttcggaaaa 
accaccgctg gtagcggtgg tttttttgtt
ggatctcaag aagatccttt gatcttttct
tcacgttaag ggattttggt catgagatta
aattaaaaat gaagttttaa atcaatctaa 
taccaatgct taatcagtga ggcacctatc 
gttgcctgac tccccgtcgt gtagataact 
agtgctgcaa tgataccgcg agacccacgc 
cagccagccg gaagggccga gcgcagaagt 
tctattaatt gttgccggga agctagagta 
gttgttgcca ttgctacagg catcgtggtg 
agctccggtt cccaacgatc aaggcgagtt 
gttagctcct tcggtcctcc gatcgttgtc
atggttatgg cagcactgca taattctctt
gtgactggtg agtactcaac caagtcattc 
tcttgcccgg cgtcaatacg ggataatacc 
atcattggaa aacgttcttc ggggcgaaaa 
agttcgatgt aacccactcg tgcacccaac
gtttctgggt gagcaaaaac aggaaggcaa
cggaaatgtt gaatactcat actcttcctt 
tattgtctca tgagcggata catatttgaa 
ccgcgcacat ttccccgaaa agtgccacct 
gcgggcttta ttaccaagcg aagcgccatt 
ggcgatcggt gcgggcctct tcgctattac
ggcgattaag ttgggtaacg ccagggtttt 
gtccgtaata cgactcactt aaggccttga 
agatacattg atgagtttgg acaaaccaca 
tgtgaaattt gtgatgctat tgctttattt
ggggtgggcg aagaactcca gcatgagatc
ccggaaaacg attccgaagc ccaacctttc 
tagcacgtgt cagtcctgct cctcggccac

<210> 116 
<211> 7631 
<212> DNA 

<213> Artificial Sequence 



agcagagcga ggtatgtagg cggtgctaca 5700 
tacactagaa ggacagtatt tggtatctgc 5760 
agagttggta gctcttgatc cggcaaacaa 5820 
tgcaagcagc agattacgcg cagaaaaaaa 5880 
acggggtctg acgctcagtg gaacgaaaac 5940
tcaaaaagga tcttcaccta gatcctttta 6000
agtatatatg agtaaacttg gtctgacagt 6060
tcagcgatct gtctatttcg ttcatccata 6120
acgatacggg agggcttacc atctggcccc 6180
tcaccggctc cagatttatc agcaataaac 6240
ggtcctgcaa ctttatccgc ctccatccag 6300
agtagttcgc cagttaatag tttgcgcaac 6360
tcacgctcgt cgtttggtat ggcttcattc 6420
acatgatccc ccatgttgtg caaaaaagcg 6480
agaagtaagt tggccgcagt gttatcactc 6540
actgtcatgc catccgtaag atgcttttct 6600
tgagaatagt gtatgcggcg accgagttgc 6660
gcgccacata gcagaacttt aaaagtgctc 6720
ctctcaagga tcttaccgct gttgagatcc 6780
tgatcttcag catcttttac tttcaccagc 6840
aatgccgcaa aaaagggaat aagggcgaca 6900
tttcaatatt attgaagcat ttatcagggt 6960
tgtatttaga aaaataaaca aataggggtt 7020
gacgtagtta acaaaaaaaa gcccgccgaa 7080
cgccattcag gctgcgcaac tgttgggaag 7140
gccagctggc gaaaggggga tgtgctgcaa 7200
cccagtcacg acgttgtaaa acgacggcca 7260
ctagagggtc gacggtatac agacatgata 7320
actagaatgc agtgaaaaaa atgctttatt 7380
gtaaccatta taagctgcaa taaacaagtt 7440
cccgcgctgg aggatcatcc agccggcgtc 7500
atagaaggcg gcggtggaat cgaaatctcg 7560
gaagtgcacg 7600



<220> 

<223> p18attBZeo5'6XHS4eGFP Plasmid



<400> 116 

cagttgccgg ccgggtcgcg cagggcgaac 
gtcatggccg gcccggaggc gtcccggaag 
tacagctcgt ccaggccgcg cacccacacc 
tcctggaccg cgctgatgaa cagggtcacg 
tccacgaagt cccgggagaa cccgagccgg 
tcgcgcgcgg tgagcaccgg aacggcactg 
caagttagta taaaaaagca ggcttcaatc 
agccccgcgg atccgctcac ggggacagcc 
gtccctcccc cgctaggggg cagcagcgag 
ccccccgcat ccccgagccg gcagcgtgcg 
gggatcgctt tcctctgaac gcttctcgct 
acggggccgc ggatccgctc acggggacag 
acgtccctcc cccgctaggg ggcagcagcg 
ctccccccgc atccccgagc cggcagcgtg 
acgggatcgc tttcctctga acgcttctcg 
atacggggcc gcggatccgc tcacggggac 
ttacgtccct cccccgctag ggggcagcag 
cgctcccccc gcatccccga gccggcagcg 
gcacgggatc gctttcctct gaacgcttct 
ggatacgggg ccgcggatcc gctcacgggg 
aattacgtcc ctcccccgct agggggcagc 
ggcgctcccc ccgcatcccc gagccggcag 
tggcacggga tcgctttcct ctgaacgctt 
ggggatacgg ggccgcggat ccgctcacgg 



tcccgccccc acggctgctc gccgatctcg 60
ttcgtggaca cgacctccga ccactcggcg 120
caggccaggg tgttgtccgg caccacctgg 180
tcgtcccgga ccacaccggc gaagtcgtcc 240
tcggtccaga actcgaccgc tccggcgacg 300
gtcaacttgg ccatggatcc agatttcgct 360
ctgcagagaa gcttgatatc gaattcctgc 420
cccccccaaa gcccccaggg atgtaattac 480
ccgcccgggg ctccgctccg gtccggcgct 540
gggacagccc gggcacgggg aaggtggcac 600
gctctttgag cctgcagaca cctgggggat 660
ccccccccca aagcccccag ggatgtaatt 720
agccgcccgg ggctccgctc cggtccggcg 780
cggggacagc ccgggcacgg ggaaggtggc 840
ctgctctttg agcctgcaga cacctggggg 900
agcccccccc caaagccccc agggatgtaa 960
cgagccgccc ggggctccgc tccggtccgg 1020
tgcggggaca gcccgggcac ggggaaggtg 1080
cgctgctctt tgagcctgca gacacctggg 1140
acagcccccc cccaaagccc ccagggatgt 1200
agcgagccgc ccggggctcc gctccggtcc 1260
cgtgcgggga cagcccgggc acggggaagg 1320
ctcgctgctc tttgagcctg cagacacctg 1380
ggacagcccc cccccaaagc ccccagggat 1440






gtaattacgt ccctcccccg ctagggggca 
ccggcgctcc ccccgcatcc ccgagccggc 
ggtggcacgg gatcgctttc ctctgaacgc 
tgggggatac ggggccgcgg atccgctcac 
atgtaattac gtccctcccc cgctaggggg 
gtccggcgct ccccccgcat ccccgagccg 
aaggtggcac gggatcgctt tcctctgaac 
cctgggggat acggggcggg ggatccacta 
tagttcatag cccatatatg gagttccgcg 
gctgaccgcc caacgacccc cgcccattga 
cgccaatagg gactttccat tgacgtcaat 
tggcagtaca tcaagtgtat catatgccaa 
aatggcccgc ctggcattat gcccagtaca 
acatctacgt attagtcatc gctattacca 
actctcccca tctccccccc ctccccaccc 
ttttgtgcag cgatgggggc gggggggggg 
cgaggggcgg ggcggggcga ggcggagagg 
ccgaaagttt ccttttatgg cgaggcggcg 
gcggcgggcg ggagtcgctg cgttgccttc 
ccgcccgccc cggctctgac tgaccgcgtt 
ttctcctccg ggctgtaatt agcgcttggt 
gcgtgaaagc cttaaagggc tccgggaggg 
gtgcgtgcgt gtgtgtgtgc gtggggagcg 
tgagcgctgc gggcgcggcg cggggctttg 
gccgggggcg gtgccccgcg gtgcgggggg 
gtgtgtgcgt gggggggtga gcagggggtg 
ctgcaccccc ctccccgagt tgctgagcac 
gggcgtggcg cggggctcgc cgtgccgggc 
ggggcggggc cgcctcgggc cggggagggc 
ccggcggctg tcgaggcgcg gcgagccgca 
gggcgcaggg acttcctttg tcccaaatct 
caccccctct agcgggcgcg ggcgaagcgg 
gagggccttc gtgcgtcgcc gcgccgccgt 
cgcaggggga cggctgcctt cgggggggac 
tgaccggcgg ctctagagcc tctgctaacc 
tcctgggcaa cgtgctggtt gttgtgctgt 
tggtgagcaa gggcgaggag ctgttcaccg 
gcgacgtaaa cggccacaag ttcagcgtgt 
gcaagctgac cctgaagttc atctgcacca 
tcgtgaccac cctgacctac ggcgtgcagt 
agcacgactt cttcaagtcc gccatgcccg 
tcaaggacga cggcaactac aagacccgcg 
tgaaccgcat cgagctgaag ggcatcgact 
agctggagta caactacaac agccacaacg 
gcatcaaggt gaacttcaag atccgccaca 
accactacca gcagaacacc cccatcggcg 
acctgagcac ccagtccgcc ctgagcaaag 
tgctggagtt cgtgaccgcc gccgggatca 
aattcactcc tcaggtgcag gctgcctatc 
tggctcacaa ataccactga gatctttttc 
agccccttga gcatctgact tctggctaat 
gttggaattt tttgtgtctc tcactcggaa 
atcagaatga gtatttggtt tagagtttgg 
caaaggtggc tataaagagg tcatcagtat 
ttccatagaa aagccttgac ttgaggttag 
ttttctttaa catccctaaa attttcctta 
tcctgactac tcccagtcat agctgtccct 
cccaagcttg catgcctgca ggtcgactct 
gtaatcatgg tcatagctgt ttcctgtgtg 
catacgagcc ggaagcataa agtgtaaagc 
attaattgcg ttgcgctcac tgcccgcttt 
ttaatgaatc ggccaacgcg cggggagagg 
ctcgctcact gactcgctgc gctcggtcgt 
aaaggcggta atacggttat ccacagaatc 
aaaaggccag caaaaggcca ggaaccgtaa 
gctccgcccc cctgacgagc atcacaaaaa 
gacaggacta taaagatacc aggcgtttcc 



gcagcgagcc gcccggggct ccgctccggt 1500
agcgtgcggg gacagcccgg gcacggggaa 1560
ttctcgctgc tctttgagcc tgcagacacc 1620
ggggacagcc cccccccaaa gcccccaggg 1680
cagcagcgag ccgcccgggg ctccgctccg 1740
gcagcgtgcg gggacagccc gggcacgggg 1800
gcttctcgct gctctttgag cctgcagaca 1860
gttattaata gtaatcaatt acggggtcat 1920
ttacataact tacggtaaat ggcccgcctg 1980
cgtcaataat gacgtatgtt cccatagtaa 2040
gggtggacta tttacggtaa actgcccact 2100
gtacgccccc tattgacgtc aatgacggta 2160
tgaccttatg ggactttcct acttggcagt 2220
tgggtcgagg tgagccccac gttctgcttc 2280
ccaattttgt atttatttat tttttaatta 2340
ggggcgcgcg ccaggcgggg cggggcgggg 2400
tgcggcggca gccaatcaga gcggcgcgct 2460
gcggcggcgg ccctataaaa agcgaagcgc 2520
gccccgtgcc ccgctccgcg ccgcctcgcg 2580
actcccacag gtgagcgggc gggacggccc 2640
ttaatgacgg ctcgtttctt ttctgtggct 2700
ccctttgtgc gggggggagc ggctcggggg 2760
ccgcgtgcgg cccgcgctgc ccggcggctg 2820
tgcgctccgc gtgtgcgcga ggggagcgcg 2880
gctgcgaggg gaacaaaggc tgcgtgcggg 2940
tgggcgcggc ggtcgggctg taaccccccc 3000
ggcccggctt cgggtgcggg gctccgtgcg 3060
ggggggtggc ggcaggtggg ggtgccgggc 3120
tcgggggagg ggcgcggcgg ccccggagcg 3180
gccattgcct tttatggtaa tcgtgcgaga 3240
ggcggagccg aaatctggga ggcgccgccg 3300
tgcggcgccg gcaggaagga aatgggcggg 3360
ccccttctcc atctccagcc tcggggctgc 3420
ggggcagggc ggggttcggc ttctggcgtg 3480
atgttcatgc cttcttcttt ttcctacagc 3540
ctcatcattt tggcaaagaa ttcgccacca 3600
gggtggtgcc catcctggtc gagctggacg 3660
ccggcgaggg cgagggcgat gccacctacg 3720
ccggcaagct gcccgtgccc tggcccaccc 3780
gcttcagccg ctaccccgac cacatgaagc 3840
aaggctacgt ccaggagcgc accatcttct 3900
ccgaggtgaa gttcgagggc gacaccctgg 3960
tcaaggagga cggcaacatc ctggggcaca 4020
tctatatcat ggccgacaag cagaagaacg 4080
acatcgagga cggcagcgtg cagctcgccg 4140
acggccccgt gctgctgccc gacaaccact 4200
accccaacga gaagcgcgat cacatggtcc 4260
ctctcggcat ggacgagctg tacaagtaag 4320
agaaggtggt ggctggtgtg gccaatgccc 4380
cctctgccaa aaattatggg gacatcatga 4440
aaaggaaatt tattttcatt gcaatagtgt 4500
ggacatatgg gagggcaaat catttaaaac 4560
caacatatgc catatgctgg ctgccatgaa 4620
atgaaacagc cccctgctgt ccattcctta 4680
atttttttta tattttgttt tgtgttattt 4740
catgttttac tagccagatt tttcctcctc 4800
cttctcttat gaagatccct cgacctgcag 4860
agaggatccc cgggtaccga gctcgaattc 4920
aaattgttat ccgctcacaa ttccacacaa 4980
ctggggtgcc taatgagtga gctaactcac 5040
ccagtcggga aacctgtcgt gccagctgca 5100
cggtttgcgt attgggcgct cttccgcttc 5160
tcggctgcgg cgagcggtat cagctcactc 5220
aggggataac gcaggaaaga acatgtgagc 5280
aaaggccgcg ttgctggcgt ttttccatag 5340
tcgacgctca agtcagaggt ggcgaaaccc 5400
ccctggaagc tccctcgtgc gctctcctgt 5460
tccgaccctg ccgcttaccg gatacctgtc
ttctcatagc tcacgctgta ggtatctcag 
ctgtgtgcac gaaccccccg ttcagcccga 
tgagtccaac ccggtaagac acgacttatc 
tagcagagcg aggtatgtag gcggtgctac 
ctacactaga aggacagtat ttggtatctg 
aagagttggt agctcttgat ccggcaaaca 
ttgcaagcag cagattacgc gcagaaaaaa 
tacggggtct gacgctcagt ggaacgaaaa
atcaaaaagg atcttcacct agatcctttt 
aagtatatat gagtaaactt ggtctgacag 
ctcagcgatc tgtctatttc gttcatccat 
tacgatacgg gagggcttac catctggccc 
ctcaccggct ccagatttat cagcaataaa 
tggtcctgca actttatccg cctccatcca 
aagtagttcg ccagttaata gtttgcgcaa 
gtcacgctcg tcgtttggta tggcttcatt 
tacatgatcc cccatgttgt gcaaaaaagc 
cagaagtaag ttggccgcag tgttatcact 
tactgtcatg ccatccgtaa gatgcttttc 
ctgagaatag tgtatgcggc gaccgagttg 
cgcgccacat agcagaactt taaaagtgct
actctcaagg atcttaccgc tgttgagatc 
ctgatcttca gcatctttta ctttcaccag 
aaatgccgca aaaaagggaa taagggcgac 
ttttcaatat tattgaagca tttatcaggg 
atgtatttag aaaaataaac aaataggggt 
tgacgtagtt aacaaaaaaa agcccgccga 
tcgccattca ggctgcgcaa ctgttgggaa 
cgccagctgg cgaaaggggg atgtgctgca 
tcccagtcac gacgttgtaa aacgacggcc 
actagagggt cgacggtata cagacatgat 
aactagaatg cagtgaaaaa aatgctttat 
tgtaaccatt ataagctgca ataaacaagt 
ccccgcgctg gaggatcatc cagccggcgt 
catagaaggc ggcggtggaa tcgaaatctc
cgaagtgcac g 

<210> 117 

<211> 4615 

<212> DNA 

<213> Artificial Sequence 



cgcctttctc ccttcgggaa gcgtggcgct 5520
ttcggtgtag gtcgttcgct ccaagctggg 5580
ccgctgcgcc ttatccggta actatcgtct 5640
gccactggca gcagccactg gtaacaggat 5700
agagttcttg aagtggtggc ctaactacgg 5760
cgctctgctg aagccagtta ccttcggaaa 5820
aaccaccgct ggtagcggtg gtttttttgt 5880
aggatctcaa gaagatcctt tgatcttttc 5940
ctcacgttaa gggattttgg tcatgagatt 6000
aaattaaaaa tgaagtttta aatcaatcta 6060
ttaccaatgc ttaatcagtg aggcacctat 6120
agttgcctga ctccccgtcg tgtagataac 6180
cagtgctgca atgataccgc gagacccacg 6240
ccagccagcc ggaagggccg agcgcagaag 6300
gtctattaat tgttgccggg aagctagagt 6360
cgttgttgcc attgctacag gcatcgtggt 6420
cagctccggt tcccaacgat caaggcgagt 6480
ggttagctcc ttcggtcctc cgatcgttgt 6540
catggttatg gcagcactgc ataattctct 6600
tgtgactggt gagtactcaa ccaagtcatt 6660
ctcttgcccg gcgtcaatac gggataatac 6720
catcattgga aaacgttctt cggggcgaaa 6780
cagttcgatg taacccactc gtgcacccaa 6840
cgtttctggg tgagcaaaaa caggaaggca 6900
acggaaatgt tgaatactca tactcttcct 6960
ttattgtctc atgagcggat acatatttga 7020
tccgcgcaca tttccccgaa aagtgccacc 7080
agcgggcttt attaccaagc gaagcgccat 7140
gggcgatcgg tgcgggcctc ttcgctatta 7200
aggcgattaa gttgggtaac gccagggttt 7260
agtccgtaat acgactcact taaggccttg 7320
aagatacatt gatgagtttg gacaaaccac 7380
ttgtgaaatt tgtgatgcta ttgctttatt 7440
tggggtgggc gaagaactcc agcatgagat 7500
cccggaaaac gattccgaag cccaaccttt 7560
gtagcacgtg tcagtcctgc tcctcggcca 7620



<220> 

<223> p18attBZeo6XHS4 Plasmid



<400> 117 

cagttgccgg 

gtcatggccg 

tacagctcgt 

tcctggaccg 

tccacgaagt 

tcgcgcgcgg 

caagttagta 

actctagtgg 

aagcgttcag 

ctgccggctc

gctgccccct 

ccccgtgagc 

agaagcgttc 

cgctgccggc 

ctgctgcccc 

gtccccgtga 

cgagaagcgt 

cacgctgccg 

cgctgctgcc 

ctgtccccgt 



ccgggtcgcg 
gcccggaggc 
ccaggccgcg 
cgctgatgaa 
cccgggagaa 
tgagcaccgg
taaaaaagca
atcccccgcc
aggaaagcga

ggggatgcgg
agcgggggag
ggatccgcgg
agaggaaagc
tcggggatgc
ctagcggggg
gcggatccgc
tcagaggaaa
gctcggggat
ccctagcggg
gagcggatcc



cagggcgaac
gtcccggaag 
cacccacacc 
cagggtcacg 
cccgagccgg 
aacggcactg 
ggcttcaatc 
ccgtatcccc 
tcccgtgcca

ggggagcgcc

ggacgtaatt
ccccgtatcc
gatcccgtgc

ggggggagcg

agggacgtaa
ggccccgtat
gcgatcccgt
gcggggggag
ggagggacgt 
gcggccccgt 



tcccgccccc 
ttcgtggaca 
caggccaggg
tcgtcccgga
tcggtccaga
gtcaacttgg 
ctgcagagaa 
caggtgtctg 
ccttccccgt 
ggaccggagc
acatccctgg 
cccaggtgtc 
caccttcccc 
ccggaccgga 
ttacatccct 
cccccaggtg 
gccaccttcc 
cgccggaccg 
aattacatcc 
atcccccagg 



acggctgctc 
cgacctccga 
tgttgtccgg
ccacaccggc 
actcgaccgc 
ccatggatcc 
gcttgcatgc
caggctcaaa 
gcccgggctg 
ggagccccgg 
gggctttggg 
tgcaggctca
gtgcccgggc 
gcggagcccc 

gggggctttg 

tctgcaggct
ccgtgcccgg
gagcggagcc
ctgggggctt
tgtctgcagg



gccgatctcg 
ccactcggcg 
caccacctgg 
gaagtcgtcc
tccggcgacg 
agatttcgct
ctgcaggtcg
gagcagcgag
tccccgcacg 
gcggctcgct 
ggggggctgt 

aagagcagcg 
tgtccccgca 

gggcggctcg
ggggggggct

caaagagcag 
gctgtccccg 
ccgggcggct 
tggggggggg 
ctcaaagagc 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 
agcgagaagc
cgcacgctgc 
ctcgctgctg 
ggctgtcccc 
gcagcgagaa 
cccgcacgct 
ggctcgctgc 

ggggctgtcc 

gagcagcgag 
tccccgcacg 
gcggctcgct 

ggggggctgt 

ctgtttcctg 
ataaagtgta
tcactgcccg 
cgcgcgggga 
ctgcgctcgg 
ttatccacag 
gccaggaacc
gagcatcaca 
taccaggcgt 
accggatacc 
tgtaggtatc 
cccgttcagc 
agacacgact 
gtaggcggtg 
gtatttggta 
tgatccggca 
acgcgcagaa 
cagtggaacg 
acctagatcc 
acttggtctg 
tttcgttcat 
ttaccatctg 
ttatcagcaa 
tccgcctcca 
aatagtttgc 
ggtatggctt
ttgtgcaaaa 
gcagtgttat 
gtaagatgct 
cggcgaccga 
actttaaaag 
ccgctgttga 
tttactttca 
ggaataaggg 
agcatttatc 
aaacaaatag 
aaaaagcccg 
gcaactgttg

ggggatgtgc 

gtaaaacgac 
tatacagaca 
aaaaaatgct 
tgcaataaac 
catccagccg 
ggaatcgaaa 



gttcagagga

cggctcgggg 

ccccctagcg 

gtgagcggat 

gcgttcagag 

gccggctcgg

tgccccctag 

ccgtgagcgg 

aagcgttcag 

ctgccggctc 

gctgccccct 

ccccgtgagc 

tgtgaaattg 

aagcctgggg 

ctttccagtc 

gaggcggttt 

tcgttcggct 

aatcagggga 

gtaaaaaggc 

aaaatcgacg 

ttccccctgg 

tgtccgcctt 

tcagttcggt 

ccgaccgctg 

tatcgccact 

ctacagagtt 

tctgcgctct 

aacaaaccac 

aaaaaggatc 

aaaactcacg 

ttttaaatta 

acagttacca 

ccatagttgc 

gccccagtgc 

taaaccagcc 

tccagtctat 

gcaacgttgt 

cattcagctc 

aagcggttag 

cactcatggt 

tttctgtgac 

gttgctcttg

tgctcatcat 

gatccagttc 

ccagcgtttc 

cgacacggaa 

agggttattg 

gggttccgcg

ccgaagcggg 

ggaagggcga 

tgcaaggcga 

ggccagtccg 

tgataagata 

ttatttgtga

aagttggggt 

gcgtcccgga 

tctcgtagca 



aagcgatccc 
atgcgggggg 
ggggagggac 
ccgcggcccc 
gaaagcgatc 
ggatgcgggg 
cgggggaggg 
atccgcggcc 
aggaaagcga 

ggggatgcgg 

agcgggggag 

ggatccgcgg 

ttatccgctc 

tgcctaatga

gggaaacctg 

gcgtattggg 

gcggcgagcg 

taacgcagga 

cgcgttgctg 

ctcaagtcag 

aagctccctc 

tctcccttcg 

gtaggtcgtt 

cgccttatcc 

ggcagcagcc 

cttgaagtgg 

gctgaagcca 

cgctggtagc 

tcaagaagat 

ttaagggatt 

aaaatgaagt 

atgcttaatc 

ctgactcccc 

tgcaatgata 

agccggaagg 

taattgttgc 

tgccattgct 

cggttcccaa 

ctccttcggt 

tatggcagca 

tggtgagtac 

cccggcgtca 

tggaaaacgt 

gatgtaaccc 

tgggtgagca 

atgttgaata 

tctcatgagc 

cacatttccc 

ctttattacc

tcggtgcggg 

ttaagttggg

taatacgact 

cattgatgag 

aatttgtgat 

gggcgaagaa 

aaacgattcc 

cgtgtcagtc 



gtgccacctt 
agcgccggac 
gtaattacat 
gtatccccca 
ccgtgccacc 
ggagcgccgg 
acgtaattac 
ccgtatcccc 
tcccgtgcca 

ggggagcgcc 

ggacgtaatt 

ggctgcagga 

acaattccac 

gtgagctaac 

tcgtgccagc 

cgctcttccg 

gtatcagctc 

aagaacatgt 

gcgtttttcc 

aggtggcgaa 

gtgcgctctc 

ggaagcgtgg 

cgctccaagc 

ggtaactatc 

actggtaaca 

tggcctaact 

gttaccttcg 

ggtggttttt 

cctttgatct 

ttggtcatga 

tttaaatcaa 

agtgaggcac 

gtcgtgtaga 

ccgcgagacc 

gccgagcgca 

cgggaagcta 

acaggcatcg 

cgatcaaggc 

cctccgatcg 

ctgcataatt 

tcaaccaagt 

atacgggata 

tcttcggggc 

actcgtgcac 

aaaacaggaa 

ctcatactct 

ggatacatat 

cgaaaagtgc 

aagcgaagcg 

cctcttcgct 

taacgccagg 

cacttaaggc 

tttggacaaa 

gctattgctt 

ctccagcatg 

gaagcccaac 

ctgctcctcg 



ccccgtgccc 

cggagcggag 

ccctgggggc 

ggtgtctgca 

ttccccgtgc 

accggagcgg 

atccctgggg 

caggtgtctg 

ccttccccgt 

ggaccggagc 

acatccctgg 

attcgtaatc 

acaacatacg

tcacattaat 

tgcattaatg 

cttcctcgct 

actcaaaggc 

gagcaaaagg 

ataggctccg 

acccgacagg 

ctgttccgac 

cgctttctca 

tgggctgtgt 

gtcttgagtc 

ggattagcag 

acggctacac 

gaaaaagagt 

ttgtttgcaa 

tttctacggg 

gattatcaaa 

tctaaagtat 

ctatctcagc 

taactacgat 

cacgctcacc 

gaagtggtcc 

gagtaagtag 

tggtgtcacg 

gagttacatg 

ttgtcagaag 

ctcttactgt 

cattctgaga 

ataccgcgcc 

gaaaactctc 

ccaactgatc 

ggcaaaatgc 

tcctttttca 

ttgaatgtat 

cacctgacgt 

ccattcgcca 

attacgccag 

gttttcccag 

cttgactaga 

ccacaactag 

tatttgtaac 

agatccccgc 

ctttcataga 

gccacgaagt 



gggctgtccc 

ccccgggcgg 

tttggggggg 

ggctcaaaga 

ccgggctgtc 

agccccgggc 

gctttggggg 

caggctcaaa 

gcccgggctg 

ggagccccgg

gggctttggg 

atggtcatag 

agccggaagc

tgcgttgcgc 

aatcggccaa 

cactgactcg

ggtaatacgg

ccagcaaaag 

cccccctgac 

actataaaga 

cctgccgctt 

tagctcacgc 

gcacgaaccc 

caacccggta 

agcgaggtat

tagaaggaca 

tggtagctct

gcagcagatt 

gtctgacgct

aaggatcttc 

atatgagtaa 

gatctgtcta 

acgggagggc

ggctccagat 

tgcaacttta 

ttcgccagtt 

ctcgtcgttt 

atcccccatg 

taagttggcc 

catgccatcc 

atagtgtatg 

acatagcaga 

aaggatctta 

ttcagcatct 

cgcaaaaaag 

atattattga 

ttagaaaaat 

agttaacaaa 

ttcaggctgc 

ctggcgaaag 

tcacgacgtt

gggtcgacgg

aatgcagtga 

cattataagc 

gctggaggat 

aggcggcggt

gcacg 



1260 

1320 

1380 

1440

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4615 



<210> 118 
<211> 17384 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pFK161 Plasmid 



<400> 118 



gcgcacgagg gagcttccag ggggaaacgc 
gccacctctg acttgagcgt cgatttttgt 
aaaacgccag caacgcggcc tttttacggt 
tgttctttcc tgcgttatcc cctgattctg 
ctgataccgc tcgccgcagc cgaacgaccg 
aagagcgctg acttccgcgt ttccagactt 
gttgttgctc aggtcgcaga cgttttgcag 
ggtgattcat tctgctaacc agtaaggcaa 
aggagcacga tcatgcgcac ccgtcagatc 
gacaaaccac aactagaatg cagtgaaaaa 
ttgctttatt tgtaaccatt ataagctgca 
attttatgtt tcaggttcag ggggaggtgt 
acaaatgtgg tatggctgat tatgatctct 
attaacccct ttacaaatta aaaagctaaa 
agcagacact ctatgcctgt gtggagtaag 
atgcctactt ataaaggtta cagaatattt 
tttttccttt gtggtgtaaa tagcaaagca 
actcaaaaaa cttagcaatt ctgaaggaaa 
ttttggagga gtagaatgtt gagagtcagc 
ttctgagcaa aacaggtttt cctcattaaa 
tccataggtt ggaatctaaa atacacaaac 
acttaaaaat tttatattta ccttagagct 
tgtcacacca cagaagtaag gttccttcac 
tccccactcc tgcagttcgg gggcatggat 
gccgacggat ttgcactgcc ggtagaactc 
aaccaactcg cgaggggatc gagcccgggg 
cgctggagga tcatccagcc ggcgtcccgg 
aaggcggcgg tggaatcgaa atctcgtgat 
tcgaacccca gagtcccgct cagaagaact 
gcgaatcggg agcggcgata ccgtaaagca 
gctcttcagc aatatcacgg gtagccaacg 
gccggccaca gtcgatgaat ccagaaaagc 
aggcatcgcc atgggtcacg acgagatcct 
gaacagttcg gctggcgcga gcccctgatg 
accggcttcc atccgagtac gtgctcgctc 
gcaggtagcc ggatcaagcg tatgcagccg 
ctcggcagga gcaaggtgag atgacaggag 
ccagtccctt cccgcttcag tgacaacgtc 
ggccagccac gatagccgcg ctgcctcgtc 
ggtcttgaca aaaagaaccg ggcgcccctg 
gcagccgatt gtctgttgtg cccagtcata 
agaacctgcg tgcaatccat cttgttcaat 
atcagatctt gatcccctgc gccatcagat 
tttgcagggc ttcccaacct taccagaggg 
tgtccataaa accgcccagt ctagctatcg 
tctctttgcg cttgcgtttt cccttgtcca 
tcagcaccgt ttctgcggac tggctttcta 
ccctgagtgc ttgcggcagc gtgaaagctt 
tcctcactac ttctggaata gctcagaggc 
gccatggggc ggagaatggg cggaactggg 
gggcgggact atggttgctg actaattgag 
ggagcctggg gactttccac acctggttgc 
tctgcctgct ggggagcctg gggactttcc 
gatctgcagg acccaacgct gcccgagatg 
gcgatggata tgttctgcca agggttggtt 
ttggctccaa ttcttggagt ggtgaatccg 
tcgaggtggc ccggctccat gcaccgcgac 
cggcgcctac aatccatgcc aacccgttcc 
gacgatcagc ggtccaatga tcgaagttag 
ctgtccctga tggtcgtcat ctacctgcct 
atgccgccgg aagcgagaag aatcataatg 
gccagcaaga cgtagcccag cgcgtcgggc 
gccgaaacgt ttggtggcgg gaccagtgac 
gaataccgca agcgacaggc cgatcatcgt 
aatgacccag agcgctgccg gcacctgtcc 
aagtgcggcg acgatagtca tgccccgcgc 
tctcaagggc atcggtcgac gctctccctt 
ctggtatctt tatagtcctg tcggggtttc 60
gatgctcgtc aggggggcgg agcctatgga 120 
tcctggcctt ttgctggcct tttgctcaca 180 
tggataaccg tattaccgcc tttgagtgag 240 
agcgcagcga gtcagtgagc gaggaagcgg 300
tacgaaacac ggaaaccgaa gaccattcat 360
cagcagtcgc ttcacgttcg ctcgcgtatc 420 
ccccgccagc ctagccgggt cctcaacgac 480
cagacatgat aagatacatt gatgagtttg 540
aatgctttat ttgtgaaatt tgtgatgcta 600 
ataaacaagt taacaacaac aattgcattc 660 
gggaggtttt ttaaagcaag taaaacctct 720 
agtcaaggca ctatacatca aatattcctt 780 
ggtacacaat ttttgagcat agttattaat 840
aaaaaacagt atgttatgat tataactgtt 900 
ttccataatt ttcttgtata gcagtgcagc 960 
agcaagagtt ctattactaa acacagcatg 1020 
gtccttgggg tcttctacct ttctcttctt 1080 
agtagcctca tcatcactag atggcatttc 1140 
ggcattccac cactgctccc attcatcagt 1200 
aattagaatc agtagtttaa cacattatac 1260 
ttaaatctct gtaggtagtt tgtccaatta 1320 
aaagatccgg accaaagcgg ccatcgtgcc 1380
gcgcggatag ccgctgctgg tttcctggat 1440 
gcgaggtcgt ccagcctcag gcagcagctg 1500 
tgggcgaaga actccagcat gagatccccg 1560
aaaacgattc cgaagcccaa cctttcatag 1620 
ggcaggttgg gcgtcgcttg gtcggtcatt 1680 
cgtcaagaag gcgatagaag gcgatgcgct 1740
cgaggaagcg gtcagcccat tcgccgccaa 1800 
ctatgtcctg atagcggtcc gccacaccca 1860 
ggccattttc caccatgata ttcggcaagc 1920 
cgccgtcggg atgcgcgcct tgagcctggc 1980 
ctcttcgtcc agatcatcct gatcgacaag 2040
gatgcgatgt ttcgcttggt ggtcgaatgg 2100 
ccgcattgca tcagccatga tggatacttt 2160 
atcctgcccc ggcacttcgc ccaatagcag 2220 
gagcacagct gcgcaaggaa cgcccgtcgt 2280
ctgcagttca ttcagggcac cggacaggtc 2340 
cgctgacagc cggaacacgg cggcatcaga 2400
gccgaatagc ctctccaccc aagcggccgg 2460 
catgcgaaac gatcctcatc ctgtctcttg 2520 
ccttggcggc aagaaagcca tccagtttac 2580
cgccccagct ggcaattccg gttcgcttgc 2640
ccatgtaagc ccactgcaag ctacctgctt 2700
gatagcccag tagctgacat tcatccgggg 2760
cgtgttccgc ttcctttagc agcccttgcg 2820
tttgcaaaag cctaggcctc caaaaaagcc 2880
cgaggcggcc taaataaaaa aaattagtca 2940
cggagttagg ggcgggatgg gcggagttag 3000
atgcatgctt tgcatacttc tgcctgctgg 3060
tgactaattg agatgcatgc tttgcatact 3120
acaccctaac tgacacacat tccacagccg 3180
cgccgcgtgc ggctgctgga gatggcggac 3240
tgcgcattca cagttctccg caagaattga 3300
ttagcgaggt gccgccggct tccattcagg 3360
gcaacgcggg gaggcagaca aggtataggg 3420
atgtgctcgc cgaggcgcat aaatcgccgt 3480
gctggtaaga gccgcgagcg atccttgaag 3540
ggacagcatg gcctgcaacg cggcatcccg 3600
gggaaggcca tccagcctcg cgtcgcgaac 3660
cgccatgccg gcgataatgg cctgcttctc 3720
gaaggcttga gcgagggcgt gcaagattcc 3780
cgcgctccag cgaaagcggt cctcgccgaa 3840
tacgagttgc atgataaaga agacagtcat 3900
ccaccggaag gagctgactg ggttgaaggc 3960
atgcgactcc tgcattagga agcagcccag 4020
tagtaggttg

gcccaacagt 

agcccgaagt 

accgcacctg 

gtcacagcat 

tgtagtaccc 

ctcatggtaa 

aattgaaaca 

cgatggtcaa 

ttgcgaactg 

caaatagtca 

catcgcacgc 

tcggcggctt 

attatcacgt 

ttgctgtctg 

gctgaaacca 

tcagaagggc 

tattgcgctt 

gacgagctgg 

tcattcatca 

cctcagccgg 

cagaacaagg 

attgaaacgt 

acaaagcaaa 

gccatcatga 

gcaattgata 

gcgacctcgc 

cttcgtcata 

tgctgaaagc 

aacaatggaa 

tcagaactgg 

gctttatgac 

cgaaaagctg 

cgggtgtggt 

ctgggcggcg 

caacgcatat 

atcccgcaag 

gacggtgccg 

ttagcaattt 

atgagaattc 

cgtctgccgg 

cgcacttttc 

tcacgtgttt 

ccggtggcgt 

gaggtgctcc 

gcgctcccca 

tgtctgagaa 

tcgtcgggtg 

ggtcgcggct 

gagaggcctg 

aatgcccttg 

ttggtcttct 

gtcggggttt 

ggaaagggtg 

ctcgccccct 

ggcctccccg 

ccgttgctgc 

gcacaccccc 

tgggtaggcg 

tccgtcgcgt 

ctgcgccgcg 

ccccccttcc 

cctcggggtc 

gttctgtggg 

gccgctcggg 

cggtgtcgcc 

ggtgtggtgg 



aggccgttga 
cccccggcca 
ggcgagcccg 
tggcgccggt 
gcgcatatcc 
acatcgtcat 
tagtccatga 
aaagagatgg 
tgcgctggat 
ttcccaacta 
ggtaatgaat 
gcacaccgta 
tgctgtgcga 
tgtccggcgc 
gtgatctgcc 
gacacacagc 
agaaatttgc 
cgatgacgct 
accagcgcat 
aggacgccgc 
tgaccaatat 
taaccgtcag 
tgatcgaaaa 
tggcagcaga 
tggaatgttt 
attattatca 

gggttttcgc 
acttaatgtt 
gagctttttg 
gtcaacaaaa 
caggaacagg 
tctgccgccg 
cgccgggagg 
cgccatgatc 
gcaaagcggt 
agcgctagca 
aggcccggca 
aggatgacga 
aactgtgata 
gcggccgctc 
tggtgtgtgg 
tcagtggttc 
cactttggtc 
tgcataccct 
tggagcgttc 
ttccctggtg 
gcccgtgaga 
aggcgcccac 
ggggttggaa 
gctttcgggg 
gaagagaacc 
ggtttccctg 
tgggtccgtc 
cgggcttctt 
gaccgcctcc 
ctccgagttc 
ggagcatgtg 
gcgtgcgcgt 
acggtgggct 
gcgtccctct 
cgtggtgcgt 
cgcggcagcg 
gagagggtcc 
agaacggctg 
ggtcttcgtc 
tcctcgggct 
gactgctcag 



gcaccgccgc 
cgggcctgcc 
atcttcccca 
gatgccggcc 
atgcttcgac 
cgctttccac 
aaatccttgt 
tgatctttct 
atgggataga 
aaatcatttt 
cctgatataa 
gaaagtcttt 
caggctcacg 
ggcgacggat 
ttctaaatct 
aactgaatac 
cgttgaacac 
tggcgttgag 
tcgtgacacc 
tatcgcaaat 
ctacaacatc 
tgccgataag 
cgcgctgaaa 
caagaaagcg 
ccccggtggt 
tttgcgggtc 
tatttatgaa 
tttatttaaa 
gcctctgtcg 
agcagctggc 
gaatgcccgt 
tcataaaatg 
ttgaagaact
gcgtagtcga 
cggacagtgc 
gcacgccata 
gtaccggcat 
tgagcgcatt 
aactaccgca 
ttctcgttct 
aaggcagggg 
gcgtggtcct 
gtgtctcgct 
tcccgtctgg 
caggtttgtc 
tgcctccggt 

ggggggtcga 
cccgcgacta 
agtttctcga 
gggaccggtt 
ttcctgttgc 
tgtgctcgtc 
ccgccctcag 
acggtctcga 
cgcgcgcgca 
ggggagggat 
gctcggcttg 
actttcctcc 
cccgggtccc 
cgctcgcgtc 
gctgtgtgct 
ttcccacggc 
gtgtctggcg 
ttggccgcgt 
ggtaggcatc 
cccggggggc 
gggagtggtg 



cgcaaggaat 

accataccca 

tcggtgatgt 

acgatgcgtc 

catgcgctca 

tgctctcgcg 

attcataaat 

aagagatgat 

tgggaatatg 

gcacgatcag 

agacaggttg 

cagttgtgag 

tctaaaagga 

gttctgtatg 

ggcacagccg 

cagaaagaaa 

ctggtcaata 

attgatacct 

gtctccttcg 

ggtgctatcc 

agccttggta 

ttcaaagtta 

aacgctgctg 

atggatgaac 

gttatctggc 

ctttccggcg 

aattttccgg 

ataccctctg 

tttcctttct 

tgacattttc 

tctgcgaggc 

gtatgccgaa 

gcggcaggcc 

tagtggctcc 

tccgagaacg 

gtgactggcg 

aaccaagcct 

gttagatttc 

ttaaagctta 

gccagcgggc 

tgcggctctc 

tgtggatgtg 

tgaccatgtt 

tgtgtgcacg 

tcctaggtgc 

gctccgtctg 

ggagagaagg 

gtacgcctgt 

gagactcatt

gcagggtctc 

cgcagacccc 

gcatgcatcc 

tgagaaagtt 

ggggtctctc 

gcgtttgctc 

cacgcggggc 

tgtggttggt 

cctcctgagg 

cacccgtctt 

cacgactttg 

tctcgggctg 

tggcgaaatc 

ttgattgatc 

ccggcgcgac 

ggtgtgtcgg 

cgtcgtgttt 
cagtgtgatt 



ggtgcatgca 
cgccgaaaca 
cggcgatata 
cggcgtagag 
caaagtaggt 
aataaagatg 
cctccaggta 
ggaatctccc 
ctgattttta 
cgcactacga 
ataaatcagt 
cctgggcaaa 
aataaatcat 
cgctgttttt 
aattgcgcga 
atcactttac 
cgcgttttgg 
ctgctgcaca 
aacttattcg 
acgcagcggc 
tccagcgtga 
aacctggtgt 
aatgtgcggc 
tggcttccta 
agcagtgccg 
atccgccttg 
tttaaggcgt 
aaaagaaagg 
ctgtttttgt 
ggtgcgagta

ggtggcaagg 

agggatgctg 
agcgaggcag 
aagtagcgaa 
ggtgcgcata 
atgctgtcgg 
atgcctacag 
atacacggtg 
tcgatgataa 
cctcgtctct 
cggcccgacg 
tgaggcgccc 
cccagagtcg 
cgctgtttct 
ctgcttctga 
gctgtgtgcc 
aggggcaaga 
gcgtagggct 
gctttcccgt 
ccctgtccgc 
cccgcgcggt 
tctctcggtg 
tccttctcta 
ccgaatggtc 
tctcgtctac 
agagcctgtc 
ggctggggag 
gccgccgtgc 
cccgtgcctc 
gccgctcccg 
tgtggttgtg 
gcgggagtcc 
tcgctctcgg 
gtcggacgtg 
catcggtctc 
cgggtcggct 
cccgccggtt 



aggagatggc 

agcgctcatg 

ggcgccagca 

gatcttggca 

gaatgcgcaa 

gaaaatcaat 

gctatatgca 

ttcagtatcc 

tgggacagag 

actttaccca 

cttctacgcg 

ccgttaactt 

gggtcataaa 

ccgtggcgcg 

gcttggtttt 

ctttctgaca 

tgagcagcaa 

aaaggcaatc 

caatggagtg 

aatcgaaaca 

tgagccagcg 

tgataccaac 

gctggatgtc 

tgtccgcacg 

tcgatagtat 

ttacggggcg 

ttccgttctt 

aaacgacagg 

ccgtggaatg 

tccgtaccat 

gtaatgaggt 

aaattgagaa 

atccacagga 

gcgagcagga 

gaaattgcat 

aatggacgat 

catccagggt 

cctgactgcg 

gcggtcaaac 

ccaccccatc 

ctgccccgcg 

ggttgtgccc 

gtggatgtgg 

tgtaagcgtc 

gctggtggtg 

ttcccgtttg 

ccccccttct 

ggtgctgagc 

ggggagcttt 

ggatgctcag 
cgcccgcgtg 
gccggggctc 
gctatcttcc 
ccctggaggg 
cgcggcccgc 
tgtcgtcctg 
agggctccgt 
ggacggggtg 
acccgtgcct 
cgacggcggc 
tcgcctcgcc 
tccttcccct 
ggacgggacc 
gggacccact 
tctctcgtgt 
cggcgctgca 
ttgcctcgcg 
tgccctgacc
gaggggcccg 
ccccctcccc 
acccgtggcc 
cggtcaccgg 
gagctgtggt 
gagagggctg 
agtggtcatt 
ccggccctgt 
accctggcgg 
gatgtctacc 
cctcgttcct
catctctcgc 
tcgccggggg 
ctcgccggct 
gacgttgcgc 
gagcccctgc 
tgtgtcgcgt 
gacgggtggc 
tcgttggtgt 
tcgccggtgt 
cggcccggtg 
gggacggagg 

gttggctttg 

tccggccgca 
cctcccgcga 
cctggtcctg 
ggtagcatat 
agtgaaactg 
ctacttggat 
tcccgggggg 
ctccggccgg 
acgccccccg 
tcgccgtgcc 
gagcctgaga 
cgacccgggg 
ggaatgagtc 
cagccgcggt 
tagttggatc
ccccttgcct 
gtttactttg 
aggaataatg 
taagagggac 
gcaagacgga 
tcggaggttc 
cgatgcggcg 
ggttccgggg 
ccaggagtgg 
cggacaggat 
ttcttagttg
ctaactagtt 
gcgttcagcc 
tgcacgcgcg 
aacccgttga 
gaattcccag 
accgcccgtc 
ggtcggccca 
agtaaaagtc 
ctgtggagga
cgcgtgcgtc 
gaaggggtgg 
tcccctctcc 
gcgtcttgcc 
ggtttttgac 
cccatccccg 
ggatgtgagt 
gtcctccccg 



ggtccgacgc 
tttcggccgc 
gctcgccgca 
gtgctgtcgg 
ggtcttgggg 
ttggagggcg 
cgtgcgaggg 
gtcccgacgg 
cgtccgtcgg
tgggattaac 
tccctctccc 
ccctctcgcg 
gcaatggcgc 
ctggccgctg 
tcgcggactc 
ctcgctgctg 
cgcacccgcc 
cgggagcgtg 
ctatccaggg
ggggagtgaa 
cgcgcttctc 
cggtcgacgt
ggagagcggg 
ccgcgtgcgt
tgcactctcc 
ggctctccgc 
tcccaccccc 
gcttgtctca 
cgaatggctc 
aactgtggta 
ggatgcgtgc 
gggtcgggcg 
tggcggcgac 
taccatggtg 
aacggctacc 
aggtagtgac 
cactttaaat 
aattccagct 
ttgggagcgg 
ctcggcgccc 
aaaaaattag
gaataggacc
ggccgggggc 
ccagagcgaa 
gaagacgatc 
gcgttattcc 
ggagtatggt 
gcctgcggct 
tgacagattg 
gtggagcgat 
acgcgacccc 
acccgagatt 
ctacactgac 
accccattcg 
taagtgcggg 
gctactaccg 
cggccctggc 
gtaacaaggt 
gcggcggcgt 
ccgggtcccg
gtggggtcgg

ctcgtccggc 
tctttcccgt 
ccgtcccggg 
ccgcggctct 
gtcgcgtgtg 
ctcctgtccc 



ccgagcggtc 
ccttgccgtc 
gccggtcttt
accccccgca 
gggggccgag
tcccggcccc 
gaaaaggttg 
tgtggtggtc 
gaaggcgcgt 
cccgcgcgcg 
cgaggtctca 
gggttcaagt 
cgcccgagtt 
tccggtctct 
ctggcttcgc
tgtgcttggg 
ggtgtgcggt 
tccgcctcgc 
ctcgcccccg 
tggtgctacc 
tttccgccaa 
tccggctctc
taagagaggt 
gtgctcgcgg
cgttccgcgc 
cgccgccgcc 
gacgctccgc 
aagattaagc 
attaaatcag 
attctagagc 
atttatcaga 
ccggcggctt 
gacccattcg 
accacgggtg
acatccaagg 
gaaaaataac 
cctttaacga 
ccaatagcgt

gcgggcggtc 

cctcgatgct 
agtgttcaaa 
gcggttctat
attcgtattg
agcatttgcc
agataccgtc
catgacccgc 
tgcaaagctg 
taatttgact 
atagctcttt 
ttgtctggtt 
cgagcggtcg
gagcaataac 
tggctcagcg 
tgatggggat 
tcataagctt 
attggatggt 
ggagcgctga 
ttccgtaggt 
ggcccgctct 
tcgcccgcgt 
tctgggtccg 
tctgacctcg 
ccggctcttc 
ggcgttcggt 
ggcttttcta 
ggctcgcccg 
gggtacctag 



tctcggtccc 
gtcgccggcc 
tttcctctct 

tgggggcggc

gggtaagaaa 
gcggccgtgg 
ccccgcgagg 
tgttggccga
gttggggcct 
tgtcccggtg 
ggccttctcc 
cgctcgtcga 
cacggtgggt
cctgcccgac 
ccggagggtc 

gggggcccgc 

ttcgcgccgc 
ggcggctaga
ccgacccccg 
ggtcattccc 
cccccacgcc 
ccgatgccga 
gtcggagagc
acgggttttg 
gagcgcccgc 
tcctcctcct 
tcgcgcttcc 
catgcatgtc 
ttatggttcc 
taatacatgc 
tcaaaaccaa 
ggtgactcta 
aacgtctgcc 
acggggaatc
aaggcagcag 
aatacaggac 
ggatccattg 
atattaaagt 
cgccgcgagg 
cttagctgag 
gcaggcccga 
tttgttggtt 
cgccgctaga 
aagaatgttt 
gtagttccga 
cgggcagctt
aaacttaaag 
caacacggga
ctcgattccg 
aattccgata 
gcgtccccca 
aggtctgtga 
tgtgcctacc 
cggggattgc
gcgttgatta 
ttagtgaggc
gaagacggtc 
gaacctgcgg
ccccgtcttg 
gtggagcgag 
tctgggaccg 
ccaccctacc 
cgtgtctacg 
cgtcggggcg 
cgttggctgg 
tcccgatgcc 
ctgtcgcgtt



ttgtgaggac 
ctcgttctgc 
ccccccctct 
cgggcacgta
gtcggctcgg
cggtgtcttg
gcaaagggaa 
ggtgcgtctg 
gccggagtgc 
tggcggtggg 
gcgcgggctc 
cctcccctcc 
tcgtcctccg 
ccccgttggc 
agggggcttc 
tgcggcctcc 
ggtcagttgg 
cgcgggtgtc 
cctgcccgtc 
tcccgcgtgg 
aacccaccac 
ggggttcggg 
tgtcccgggg 
tcggaccccg
ccggctcacc 
ctctcgcgct
ttacctggtt 
taagtacgca
tttggtcgct 
cgacgggcgc 
cccggtgagc 
gataacctcg 
ctatcaactt 
agggttcgat 
gcgcgcaaat 
tctttcgagg 
gagggcaagt 
tgctgcagtt 
cgagtcaccg
tgtcccgcgg 
gccgcctgga 
ttcggaactg 
ggtgaaattc 
tcattaatca 
ccataaacga 
ccgggaaacc 
gaattgacgg
aacctcaccc 
tgggtggtgg 
acgaacgaga 
acttcttaga 
tgcccttaga
ctgcgccggc 
aattattccc 
agtccctgcc 
cctcggatcg 
gaacttgact 
aaggatcatt 
tgtgtgtcct 
gtgtctggag
cctccgattt 
gcggcggcgg
aggggcggta
cgcgctttgc 
ggcggttgtc 
acgcttttct
ccggcgcgga 



ccccttccgg 
tgtgtcgttc 
cctctgactg 
cgcgtccggg 
cgggcgggag
cgcggtcttg 
agaggctagc 

gggggctcgt 

cgaggtgggt 
ggctccggtc 
tcggccctcc 
tccgtccttc 
cctccgcttc 
gtggtcttct 
ccggttcccc 
gcccgcccgt 
gccctggcgt 
gccgggctcc 
ccggcggtgg
tttgactgtc 
cctgctctcc 
atttgtgccg 
cgacgctcgg
acggggtcgg
cccggtttgt 
ctctgtcccg 
gatcctgcca 
cggccggtac 
cgctcctctc 
tgacccccct 
tccctcccgg 
ggccgatcgc 
tcgatggtag 
tccggagagg
tacccactcc 
ccctgtaatt 
ctggtgccag 
aaaaagctcg 
cccgtccccg 
ggcccgaagc 
taccgcagct 
aggccatgat
ttggaccggc 
agaacgaaag 
tgccgactgg 
aaagtctttg 
aagggcacca 
ggcccggaca 
tgcatggccg 
ctctggcatg 
gggacaagtg 
tgtccggggc
aggcgcgggt
catgaacgag 
ctttgtacac 
gccccgccgg 
atctagagga 
aaacgggaga
cgccgggagg
tgaggtgaga 
cccctccccc 
ctgctcgcgg 
cgtcgttacg 
tctcccggca 
gcgtgtgggg 
ggcctcgcgt 
ggtttaagga 
ccccgggggg
cggtcgttcg
cccgaggcgg 
cccgacccgc 
gggttcccgt 
cacgtgtctc 
cctctctctc 
cgtgagttcg 
tgcgtcgatg 
catcgacact
cgtcggttga
ctcgcagggc 

gggcggttgt 

cgcgctcgcg 
gcctcgcgtc 
tgggaaccca 
gaggttggcg
ggttgtcggg 
gtttgggtct 
ggcgccgcgc 
gtatccccgg 
cctcggtggg 

cgtggctctt 

ccgcgggacg 
gggagggaga 
ctgtgggctg 
ccctcccgcc 
gccgggtgcc 
tgtcccccct 
attagtcagc 
gaagagccca 
gacccactcc 
tggacggtgt
gttgcttggg 
cgagaccgat 
tcaagagggc 
gattcaaccc 
ccccgttcct 
gcctccggcg 
gggtcggcgg 
ggcggtgcgc

gggggggcgg 
ggccgcgctt 
ctctcccccc 
ggcgcgaccg 
cggactgtcc 
gtcacgcgtc 
cgacccgtct 
gaaagccgcc 
cgaggcctct 
aggtggagca 
cgaagccaga 
cgacctgggt 
tttccctcag 
aatgattaga 
agaagcccgg 
ttggtaagca 
gacgctcatc 
gaagtcggaa 
aatggatggc 
ggacgggagc 
aggttaatgt 
tgcgcggaac 
gacaataacc 
atttccgtgt 
agaaacgctg 
cgaactggat 



gtcgccctgc 
ggcggctctc 
cggtcgtgtg 
gccgccggct 
gtcgttcccg 
gtttcgttcc
cggggagagg 
ctcacacccg 
aagaacgcag 
tcgaacgcac 
cgatcaatcg 
caacccccca 
cggtgtggcg 
gcttcttccc 
ggcgcctccc 
ccgcgccccc 
gttgagggtg 
gtggcggtcg
tgcgctgggg 
accctccggc 
tggcgttgcg 
cgccttcgcg 
cttcgtctcc 
ccgcggcgtc 
gggcctcgct 
tgcgtcccgg 
ggcctctcgg 
gtctctttcc 
ttctgaccgc 
ggaggaaaag 
gcgccgaatc 
ccggcgccgc 
gaggccggta
aatgcagccc 
agtcaacaag 
gtgaaaccgt
ggcggcgcgc 
cccgacccct 
gcgggcgcgg
gggaccgccc 
cgcgaccggc 
cgcgtctcag 
tcgccgaatc 
gtccgcctcc 
ctctcccacc 
ccagtgcgcc 
tcccgacgaa 
tgaaacacgg
gtggcgcaat 
ccagtccgcc 
cgagcgtacg
ggaaactctg 
ataggggcga
gatagctggc 
ggtcttgggg 
ctcgctggcg 
gaactggcgc 
agaccccaga 
tccgctaagg
gctggagcgt 
ggccgcgaat 
catgataata 
ccctatttgt 
ctgataaatg 
cgcccttatt 
gtgaaagtaa 
ctcaacagcg 



cgcccccagg 
cctcagactc 

ggggggtgga 

tgcccgattt 
tgtttttccg
tgctggccgg 
agggcggtgg 
aaataccgat 
ctagctgcga
ttgcggcccc 
cgtcacccgc 
acccgggtcg 
cgcgcgcccg 
gctccgccgt 
ggaccgctgc 
gtggcgcccg 
tgcgtgcgcc 
acgagggccg
gaggcggggt
ttgtgtggag 
agggagggtt 
ccgcacgcgg 
gcttctcctt 
cgtgcgccga 
gacccgttgc 

gggttgcgtg 

ggaccccctg 
cgcccgcctc 
gacctcagat 
aaactaacca 
cccgccgcgc 
tcgtgggggg 
gcggccccgg 
aaagcgggtg 
taccgtaagg
taagaggtaa 
gtccggccgt 
ccacccgcgc 
ggggtggtgt 
ccggccggcg 
tccgggacgg
ggcgcgccga 
ccggggccga 
cgggcgggcg
cccctccgtc 
ccgggcgtcg 
gccgagcgca 
accaaggagt 
gaaggtgaag 
gagggcgcac
cgttaggacc
gtggaggtcc 
aagactaatc 
gctctcgctc 
ccgaaacgat 
tggagccggg 
tgcgggatga 
aaaggtgttg
agtgtgtaac
cgggcccata 
tcttgaagac 
atggtttctt
ttatttttct 
cttcaataat 
cccttttttg 
aagatgctga 
gtaagatcct 



gtcggggggc
catgaccctc 
tgtctggagc 
ccgcgggtcg 
ctcccgaccc 
cctgaggcta 
tcgttggggg 
acgactctta 
gaattaatgt 
gggttcctcc 
tgcggtgggt 
ggccctccgt 
cgtcgcggag
tcccgccctc 
ctcaccagtc
ggggtgggcg 

gaggtggtgg 
gtcggtcgcc 
cgaccgctcg 
ggagagcgag
tggcgtcccg 
ccgctagggg 
cacccgggcg 
tgcgagtcac
gtcccggctt 
tgagtaagat 
agacggttcg
ctcgctctct 
cagacgtggc 
ggattccctc 
gtcgcggcgt
cccaagtcct 
cgcgccgggc 
gtaaactcca 
gaaagttgaa 
acgggtgggg 
gcccggtggt 
gtcgttcccc 

ggtggtggcg 

accggccgcc 
ccgggaaggc
accacctcac 
ggaagccaga

tgggggtggg 

gcctctctcg 
tcgcgccgtc 
cggggtcggc
ctaacgcgtg
ggccccgccc 
caccggcccg 
cgaaagatgg 
gtagcggtcc
gaaccatcta 
ccgacgtacg 
ctcaacctat 
cgtggaatgc 
accgaacgcc 
gttgatatag 
aactcacctg 
cccggccgtc 
gaaagggcct
agacgtcagg
aaatacattc 
attgaaaaag 
cggcattttg
agatcagttg 
tgagagtttt 



ggtggggccc 
ctccccccgc
cccctcgggc 
gtcctgtcgg 
tttttttttc 
cccctcggtc
actgtgccgt 
gcggtggatc
gaattgcagg
cggggctacg
gctgcgcggc 
ctcccgaagt 
cctggtctcc 
gcccgtgcac 
tttctcggtc 
cgtccgcatc 
tcggtcccct 
tgcggtggtt 
cggggttggc
ggcgagaacg 
cgtccgtccg 
cggtcggggc
gtacccgctc 
ccccgggtgt 
ccctgggggg 
cctccacccc 
ccggctcgtc 
tcttcccgcg 
gacccgctga 
agtaacggcg 
gggaaatgtg 
tctgatcgag 
tcgggtcttc
tctaaggcta 
aagaactttg 
tccgcgcagt 
cccggcggat 
tcttcctccc 
cgcgggcggg 
gccgggcgca 
ccggtgggga 
cccgagtgtt 
tacccgtcgc 
ggccgggccg 
gggcccggtg 
gggtcccggg 
ggcgatgtcg 
cgcgagtcag

gggggcccga

tctcgcccgc 
tgaactatgc 
tgacgtgcaa 
gtagctggtt 
cagttttatc 
tctcaaactt 
gagtgcctag
gggttaaggc 
acagcaggac 
ccgaatcaac 
gccgcagtcg 
cgtgatacgc 
tggcactttt 
aaatatgtat 
gaagagtatg
cttcctgttt 
ggtgcacgag 
cgccccgaag 



gtagggaagt 
tgccgccgtt 
gccgtggggg 
tgccggtcgt 
ctccccccca 
catctgttct 
cgtcagcacc 
actcggctcg 
acacattgat 
cctgtctgag 
tgggagtttg 
tcagacgtgt 
cccgcgcatc 
cccggtcctg 
ccgtgccccg 
tgctctggtc 
gcggccgcgg 
gtctgtgtgt 
gcggtcgccc
gagagaggtg 
tccctccctc 
ccgtggcccc 

cggcgccggc 

tgcgagttcg 
gacccggcgt 
cgccgccctc 
ctcccgtgcc 
gctgggcgcg 
atttaagcat 
agtgaacagg 
gcgtacggaa 
gcccagcccg 
ccggagtcgg
aataccggca
aagagagagt 
ccgcccggag 
ctttcccgct
cgcgtccggc 
gccgggggtg
cttccaccgt 
aggtggctcg 
acagccctcc 
cgcgctctcc 
cccctcccac 

gggggcgggg 

gggaccgtcg
gctacccacc 
gggctcgtcc 
ggtgggatcc 
cgcgccgggg 
ttgggcaggg 
atcggtcgtc
ccctccgaag 
cggtaaagcg
taaatgggta 
tgggccactt
gcccgatgcc 
ggtggccatg 
tagccctgaa
gaacggaacg 
ctatttttat 
cggggaaatg
ccgctcatga 
agtattcaac 
ttgctcaccc 
tgggttacat
aacgttttcc 



12120 
12180 
12240 
12300 
12360 
12420 
12480 
12540 
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
13140 
13200 
13260 
13320 
13380 
13440 
13500 
13560 
13620 
13680 
13740 
13800 
13860 
13920 
13980 
14040 
14100 
14160 
14220 
14280 
14340 
14400 
14460 
14520 
14580 
14640 
14700 
14760 
14820 
14880 
14940 
15000
15060 
15120 
15180 
15240 
15300 
15360 
15420 
15480 
15540 
15600 
15660 
15720 
15780 
15840 
15900 
15960 
16020 
16080 






aatgatgagc 
gcaagagcaa 
agtcacagaa 
aaccatgagt 
gctaaccgct 
ggagctgaat 
aacaacgttg 
aatagactgg 
tggctggttt 
agcactgggg 
ggcaactatg 
ttggtaactg 
ttaatttaaa 
acgtgagttt 
agatcctttt 
ggtggtttgt 
cagagcgcag 
gaactctgta 
cagtggcgat 
gcagcggtcg 
caccgaactg 
aaggcggaca 



acttttaaag 
ctcggtcgcc 
aagcatctta 
gataacactg 
tttttgcaca 
gaagccatac 
cgcaaactat 
atggaggcgg 
attgctgata 
ccagatggta 
gatgaacgaa 
tcagaccaag 
aggatctagg 
tcgttccact 
tttctgcgcg 
ttgccggatc 
ataccaaata 
gcaccgccta 
aagtcgtgtc 
ggctgaacgg 
agatacctac 
ggtatccggt 



ttctgctatg 
gcatacacta 
cggatggcat 
cggccaactt 
acatggggga 
caaacgacga 
taactggcga 
ataaagttgc 
aatctggagc 
agccctcccg 
atagacagat 
tttactcata 
tgaagatcct 
gagcgtcaga 
taatctgctg 
aagagctacc 
ctgtccttct 
catacctcgc 
ttaccgggtt 

ggggttcgtg 

agcgtgagct 
aagcggcagg 



tggcgcggta
ttctcagaat 
gacagtaaga 
acttctgaca 
tcatgtaact 
gcgtgacacc 
actacttact 
aggaccactt 
cggtgagcgt 
tatcgtagtt 
cgctgagata 
tatactttag 
ttttgataat 
ccccgtagaa 
cttgcaaaca 
aactcttttt 
agtgtagccg 
tctgctaatc 
ggactcaaga 
cacacagccc 
atgagaaagc 
gtcggaacag 



ttatcccgtg 
gacttggttg 
gaattatgca 
acgatcggag 
cgccttgatc 
acgatgcctg 
ctagcttccc 
ctgcgctcgg 
gggtctcgcg 
atctacacga 
ggtgcctcac 
attgatttaa 
ctcatgacca 
aagatcaaag 
aaaaaaccac 
ccgaaggtaa 
tagttaggcc 
ctgttaccag 
cgatagttac 
agcttggagc 
gccacgcttc 
gaga 



ttgacgccgg 
agtactcacc 
gtgctgccat 
gaccgaagga 
gttgggaacc 
cagcaatggc 
ggcaacaatt 
cccttccggc 
gtatcattgc 
cggggagtca 
tgattaagca 
aacttcattt 
aaatccctta 
gatcttcttg 
cgctaccagc 
ctggcttcag 
accacttcaa 
tggctgctgc 
cggataaggc 
gaacgaccta 
cgaagggaga 



<210> 119 

<211> 2814 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pLITMUS38 Plasmid



<400> 119 

gttaactacg 

tttctaaata 

ataatattga 

ttttgcggca 

tgctgaagat 

gatccttgag 

gctatgtggc 

acactattct 

tggcatgaca 

caacttactt 

gggggatcat 

cgacgagcgt 
tggcgaacta 
agttgcagga 
tggagccggt 
ctcccgtatc 
acagatcgct 
ctcatatata 
aagattgtat 
aatttttgtt 
aaatcaaaag 
ctattaaaga 
ccactacgtg 
aatcggaacc 
gaaaggaagg 
cgctgcgcgt 
atctaggtga 
ttccactgag 
ctgcgcgtaa 
ccggatcaag 
ccaaatactg 
ccgcctacat 
tcgtgtctta 
tgaacggggg 
tacctacagc 



tcaggtggca 
cattcaaata 
aaaaggaaga 
ttttgccttc 
cagttgggtg 
agttttcgcc 
gcggtattat 
cagaatgact 
gtaagagaat
ctgacaacga 
gtaactcgcc 
gacaccacga
cttactctag 
ccacttctgc 
gagcgtgggt
gtagttatct 
gagataggtg 
ctttagattg 
aagcaaatat 
aaatcagctc 
aatagcccga
acgtggactc 
aaccatcacc 
ctaaagggag
gaagaaagcg 
aaccaccaca 
agatcctttt 
cgtcagaccc 
tctgctgctt
agctaccaac 
ttcttctagt 
acctcgctct 
ccgggttgga 
gttcgtgcac 
gtgagctatg 



cttttcgggg
tgtatccgct
gtatgagtat 
ctgtttttgc 
cacgagtggg 
ccgaagaacg 
cccgtgttga 
tggttgagta 
tatgcagtgc 
tcggaggacc
ttgatcgttg 
tgcctgtagc 
cttcccggca 
gctcggccct 
ctcgcggtat
acacgacggg
cctcactgat 
atttaccccg 
ttaaattgta 
attttttaac 
gatagggttg 
caacgtcaaa
caaatcaagt 
cccccgattt 
aaaggagcgg
cccgccgcgc 
tgataatctc 
cgtagaaaag
gcaaacaaaa 
tctttttccg 
gtagccgtag
gctaatcctg 
ctcaagacga 
acagcccagc 
agaaagcgcc



aaatgtgcgc 
catgagacaa 
tcaacatttc 
tcacccagaa 
ttacatcgaa 
ttctccaatg 
cgccgggcaa 
ctcaccagtc 
tgccataacc 
gaaggagcta 
ggaaccggag
aatggcaaca 
acaattaata 
tccggctggc 
cattgcagca
gagtcaggca 
taagcattgg 
gttgataatc 
aacgttaata 
caataggccg
agtgttgttc 
gggcgaaaaa 
tttttggggt 
agagcttgac 
gcgctagggc
ttaatgcgcc
atgaccaaaa 
atcaaaggat 
aaaccaccgc 
aaggtaactg 
ttaggccacc 
ttaccagtgg 
tagttaccgg
ttggagcgaa 
acgcttcccg 



ggaaccccta 
taaccctgat 
cgtgtcgccc 
acgctggtga 
ctggatctca 
atgagcactt 
gagcaactcg 
acagaaaagc 
atgagtgata 
accgcttttt
ctgaatgaag
acgttgcgca
gactggatgg 
tggtttattg 
ctggggccag 
actatggatg 
taactgtcag 
agaaaagccc
ttttgttaaa 
aaatcggcaa 
cagtttggaa
ccgtctatca 
cgaggtgccg 
ggggaaagcg 
gctggcaagt 
gctacagggc 
tcccttaacg 
cttcttgaga 
taccagcggt
gcttcagcag 
acttcaagaa 
ctgctgccag 
ataaggcgca
cgacctacac 
aagggagaaa 



tttgtttatt 
aaatgcttca
ttattccctt 
aagtaaaaga 
acagcggtaa
ttaaagttct 
gtcgccgcat 
atcttacgga
acactgcggc 
tgcacaacat 
ccataccaaa 
aactattaac 
aggcggataa
ctgataaatc 
atggtaagcc 
aacgaaatag 
accaagttta 
caaaaacagg
attcgcgtta
aatcccttat
caagagtcca 
gggcgatggc 
taaagcacta 
aacgtggcga 
gtagcggtca
gcgtaaaagg
tgagttttcg 
tccttttttt 
ggtttgtttg 
agcgcagata
ctctgtagca 
tggcgataag 
gcggtcgggc
cgaactgaga 
ggcggacagg



16140 
16200 
16260 
16320 
16380 
16440 
16500 
16560 
16620 
16680 
16740 
16800 
16860 
16920 
16980 
17040 
17100 
17160 
17220 
17280 
17340 
17384 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 






tatccggtaa 
gcctggtatc 
tgatgctcgt 
ttcctggcct 
accccaggct 
acaatttcac 
ctagtggggc 
tccacgaatt 
ctctagtcaa 
actgggaaaa 
gctggcgtaa 
atggcgaatg 



gcggcagggt 
tttatagtcc 
caggggggcg 
tttgctggcc 
ttacacttta 
acaggaaaca 
ccgtgcaatt 
cgctagcttc 
ggccttaagt 
ccctggcgtt 
tagcgaagag 
gcgcttcgct 



cggaacagga
tgtcgggttt 
gagcctatgg 
ttttgctcac 
tgcttccggc 
gctatgacca 
gaagccggct 
ggccgtgacg 
gagtcgtatt 
acccaactta 
gcccgcaccg 
tggtaataaa 



gagcgcacga 
cgccacctct 
aaaaacgcca 
atgtaatgtg 
tcgtatgttg 
tgattacgcc 
ggcgccaagc 
cgtctccgga
acggactggc 
atcgccttgc 
atcgcccttc 
gcccgcttcg 



gggagcttcc 
gacttgagcg 
gcaacgcggc 
agttagctca 
tgtggaattg 
aagctacgta 
ttctctgcag 
tgtacaggca 
cgtcgtttta 
agcacatccc 
ccaacagttg 
gcgggctttt 



agggggaaac 
tcgatttttg
ctttttacgg 
ctcattaggc 
tgagcggata 
atacgactca 
gatatctgga 
tgcgtcgacc 
caacgtcgtg 
cctttcgcca 
cgcagcctga 
tttt 



<210> 120 
<211> 2847 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pLIT38attB Plasmid 



<400> 120 

gttaactacg 

tttctaaata 

ataatattga 

ttttgcggca 

tgctgaagat 

gatccttgag 

gctatgtggc 

acactattct 

tggcatgaca 

caacttactt 

gggggatcat

cgacgagcgt 

tggcgaacta 

agttgcagga 

tggagccggt 

ctcccgtatc 

acagatcgct 

ctcatatata 

aagattgtat

aatttttgtt 

aaatcaaaag 

ctattaaaga 

ccactacgtg 

aatcggaacc 

gaaaggaagg 

cgctgcgcgt 

atctaggtga 

ttccactgag 

ctgcgcgtaa 

ccggatcaag 

ccaaatactg 

ccgcctacat 

tcgtgtctta 

tgaacggggg 

tacctacagc 

tatccggtaa 

gcctggtatc 

tgatgctcgt 

ttcctggcct 

accccaggct 

acaatttcac 

ctagtggggc 

tgctttttta 

acgcgtctcc 

attacggact 



tcaggtggca 
cattcaaata 
aaaaggaaga 
ttttgccttc
cagttgggtg 
agttttcgcc 
gcggtattat 
cagaatgact 
gtaagagaat 
ctgacaacga 
gtaactcgcc 
gacaccacga 
cttactctag 
ccacttctgc 
gagcgtgggt 
gtagttatct 
gagataggtg 
ctttagattg 
aagcaaatat 
aaatcagctc 
aatagcccga 
acgtggactc 
aaccatcacc 
ctaaagggag 
gaagaaagcg 
aaccaccaca 
agatcctttt 
cgtcagaccc 
tctgctgctt 
agctaccaac 
ttcttctagt 
acctcgctct 
ccgggttgga 
gttcgtgcac 
gtgagctatg 
gcggcagggt 
tttatagtcc 
caggggggcg 
tttgctggcc 
ttacacttta 
acaggaaaca 
ccgtgcaatt 
tactaacttg 
ggatgtacag 
ggccgtcgtt 



cttttcgggg 
tgtatccgct 
gtatgagtat 
ctgtttttgc 
cacgagtggg 
ccgaagaacg 
cccgtgttga 
tggttgagta 
tatgcagtgc 
tcggaggacc 
ttgatcgttg 
tgcctgtagc 
cttcccggca 
gctcggccct 
ctcgcggtat 
acacgacggg 
cctcactgat 
atttaccccg 
ttaaattgta
attttttaac 
gatagggttg 
caacgtcaaa 
caaatcaagt 
cccccgattt 
aaaggagcgg 
cccgccgcgc 
tgataatctc 
cgtagaaaag 
gcaaacaaaa 
tctttttccg 
gtagccgtag 
gctaatcctg 
ctcaagacga 
acagcccagc 
agaaagcgcc 
cggaacagga 
tgtcgggttt 
gagcctatgg 
ttttgctcac 
tgcttccggc 
gctatgacca 
gaagccggct 
agcgaaatct 
gcatgcgtcg 
ttacaacgtc 



aaatgtgcgc 
catgagacaa 
tcaacatttc 
tcacccagaa 
ttacatcgaa 
ttctccaatg 
cgccgggcaa 
ctcaccagtc 
tgccataacc 
gaaggagcta 
ggaaccggag 
aatggcaaca 
acaattaata 
tccggctggc 
cattgcagca 
gagtcaggca 
taagcattgg 
gttgataatc 
aacgttaata 
caataggccg 
agtgttgttc 
gggcgaaaaa 
tttttggggt 
agagcttgac 
gcgctagggc 
ttaatgcgcc 
atgaccaaaa 
atcaaaggat 
aaaccaccgc 
aaggtaactg 
ttaggccacc 
ttaccagtgg 
tagttaccgg 
ttggagcgaa
acgcttcccg 
gagcgcacga 
cgccacctct 
aaaaacgcca 
atgtaatgtg 
tcgtatgttg 
tgattacgcc 
ggcgccaagc 
ggatccacga 
accctctagt 
gtgactggga 



ggaaccccta 
taaccctgat 
cgtgtcgccc 
acgctggtga 
ctggatctca 
atgagcactt 
gagcaactcg 
acagaaaagc 
atgagtgata 
accgcttttt 
ctgaatgaag 
acgttgcgca 
gactggatgg 
tggtttattg 
ctggggccag 
actatggatg 
taactgtcag 
agaaaagccc 
ttttgttaaa 
aaatcggcaa 
cagtttggaa 
ccgtctatca 
cgaggtgccg 
ggggaaagcg 
gctggcaagt 
gctacagggc 
tcccttaacg 
cttcttgaga 
taccagcggt 
gcttcagcag 
acttcaagaa 
ctgctgccag 
ataaggcgca 
cgacctacac 
aagggagaaa 
gggagcttcc 
gacttgagcg 
gcaacgcggc 
agttagctca 
tgtggaattg 
aagctacgta 
ttctctgcag 
attcgctagc 
caaggcctta 
aaaccctggc 



tttgtttatt 
aaatgcttca 
ttattccctt 
aagtaaaaga 
acagcggtaa 
ttaaagttct 
gtcgccgcat 
atcttacgga 
acactgcggc 
tgcacaacat 
ccataccaaa 
aactattaac 
aggcggataa 
ctgataaatc 
atggtaagcc 
aacgaaatag 
accaagttta 
caaaaacagg 
attcgcgtta 
aatcccttat 
caagagtcca 
gggcgatggc 
taaagcacta 
aacgtggcga 
gtagcggtca 
gcgtaaaagg
tgagttttcg 
tccttttttt 
ggtttgtttg 
agcgcagata 
ctctgtagca 
tggcgataag 
gcggtcgggc 
cgaactgaga 
ggcggacagg 
agggggaaac 
tcgatttttg 
ctttttacgg 
ctcattaggc 
tgagcggata 
atacgactca 
gattgaagcc 
ttcggccgtg 
agtgagtcgt 
gttacccaac 



2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2814 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 




ttaatcgcct tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca 2760
ccgatcgccc ttcccaacag ttgcgcagcc tgaatggcga atggcgcttc gcttggtaat 2820
aaagcccgct tcggcgggct ttttttt 2847

<210> 121 
<211> 4223 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> pLIT38attBBSRpolyA2 Plasmid



<400> 121 

accatgaaaa 

aagattacaa 

acaggagaaa 

gcagaagcca 

gtagctgtta 

tgtggtatgt 

atgaatggca 

aattaaaagt

ttctaccggc 

catgcccccg 

aaggaacctt 

tctaaggtaa 

tgtgtatttt 

aatgaggaaa 

gactctcaac 

ccttcagaat 

tttgctattt 

tattctgtaa 

actccacaca 

agctttttaa 

gatcataatc 

cctccccctg 

agcttataat 

tttcgctcaa 

cggcttcaat 

tcatagctgt 

ggaagcataa 

agcaaaaggc 

taggctccgc 

cccgacagga 

tgttccgacc 

gctttctcat 

gggctgtgtg 

tcttgagtcc 

gattagcaga 

cggctacact 

aaaaagagtt 

tgtttgcaag 

ttctacgggg 

attatcaaaa 

cggcgggtgt 

ctcctttcgc 

tcgggggctc

tgatttgggt 

gacgttggag 

ccctatctcg 

aaaaaatgag 

aatttaaata 

ggtaaatcaa

agtgaggcac 

gtcgtgtaga

ccgcgagacc 

gccgagcgca 

cgggaagcta 



catttaacat 
tgctttatga 
tcatttcggc 
ttgcgattgg 
gacaccctta 
gtagggagtt 
agttagtcaa
tttaccatac 
agtgcaaatc 
aactgcagga 
acttctgtgg 
atataaaatt 
agattccaac 
acctgttttg 
attctactcc 
tgctaagttt 
acaccacaaa 
cctttataag 
ggcatagagt 
tttgtaaagg 
agccatacca 
aacctgaaac 
ggttacaaat 
gttagtataa 
tgcacgggcc 
ttcctgtgtg
agtgtaaagc 
cagcaaaagg 
ccccctgacg 
ctataaagat 
ctgccgctta 
agctcacgct 
cacgaacccc 
aacccggtaa 
gcgaggtatg 
agaagaacag 
ggtagctctt 
cagcagatta 
tctgacgctc 
aggatcttca 
ggtggttacg 
tttcttccct 
cctttagggt 
gatggttcac 
tccacgttct 
ggctattctt 
ctgatttaac 
tttgcttata 
tctaaagtat 
ctatctcagc 
taactacgat 
cacgctcacc 
gaagtggtcc 
gagtaagtag 



ttctcaacaa 
ggataataaa
agtacatatt 
tagtgcagtt 
ttctgacgaa 
gatttcagac 
aactacgatt 
caagcttggc 
cgtcggcatc 
gtggggaggc 
tgtgacataa 
tttaagtgta 
ctatggaact 
ctcagaagaa 
tccaaaaaag 
tttgagtcat 
ggaaaaagct 
taggcataac 
gtctgctatt 
ggttaataag 
catttgtaga 
ataaaatgaa 
aaagcaatag
aaaagcaggc 
ccactagtga 
aaattgttat 
ctggggtgcc
ccaggaaccg 
agcatcacaa
accaggcgtt
ccggatacct 
gtaggtatct 
ccgttcagcc 
gacacgactt 
taggcggtgc 
tatttggtat 
gatccggcaa
cgcgcagaaa 
agtggaacga 
cctagatcct 
cgcagcgtga 
tcctttctcg 
tccgatttag 
gtagtgggcc 
ttaatagtgg 
ttgatttata 
aaaaatttaa 
caatcttcct 
atatgagtaa 
gatctgtcta 
acgggagggc
ggctccagat 
tgcaacttta 
ttcgccagtt



gatctagaat 
catcatgtgg 
gaagcgtata
tcgaatggac 
gtagatagaa 
tatgcaccag
gaagaactca 
tgctgcctga 
caggaaacca
acgatggccg 
ttggacaaac 
taatgtgtta 
gatgaatggg 
atgccatcta
aagagaaagg 
gctgtgttta 
gcactgctat
agttataatc 
aataactatg 
gaatatttga 
ggttttactt 
tgcaattgtt 
catcacaaat 
ttcaatcctg 
gtcgtattac
ccgctcacaa 
taatgagtga 
taaaaaggcc
aaatcgacgc
tccccctgga 
gtccgccttt 
cagttcggtg 
cgaccgctgc 
atcgccactg 
tacagagttc 
ctgcgctctg 
acaaaccacc 
aaaaggatct 
aaactcacgt 
tttacgcgcc 
ccgctacact 
ccacgttcgc 
tgctttacgg
atcgccctga 
actcttgttc 
agggattttg
cgcgaatttt 
gtttttgggg 
acttggtctg 
tttcgttcat 
ttaccatctg 
ttatcagcaa 
tccgcctcca 
aatagtttgc 



tagtagaagt 
gagcggcaat
taggacgagt 
aaaaggattt 
gtattcgagt 
attgttttgt 
ttccactcaa 
ggctggacga 
gcagcggcta
ctttggtccg 
tacctacaga 
aactactgat 
agcagtggtg 
gtgatgatga 
tagaagaccc 
gtaatagaac 
acaagaaaat 
ataacatact 
ctcaaaaatt 
tgtatagtgc 
gctttaaaaa 
gttgttaact 
ttcacaaata 
cagagaagct 
gtagcttggc 
ttccacacaa 
gctaactcac 
gcgttgctgg 
tcaagtcaga 
agctccctcg 
ctcccttcgg 
taggtcgttc
gccttatccg 
gcagcagcca 
ttgaagtggt 
ctgaagccag 
gctggtagcg 
caagaagatc 
taagggattt 
ctgtagcggc 
tgccagcgcc 
tttccccgtc 
cacctcgacc 
tagacggttt
caaactggaa 
ccgatttcgg
aacaaaatat 
cttttctgat 
acagttacca 
ccatagttgc 
gccccagtgc 
taaaccagcc 
tccagtctat 
gcaacgttgt 



agcgacagag
tcgtacgaaa
aactgtttgt 
tgacacgatt 
ggtaagtcct 
gttaatagaa 
atatacccga 
cctcgcggag 
tccgcgcatc 
gatctttgtg 
gatttaaagc 
tctaattgtt 
gaatgccttt 
ggctactgct
caaggacttt 
tcttgcttgc
tatggaaaaa 
gttttttctt 
gtgtaccttt 
cttgactaga 
acctcccaca 
tgtttattgc 
aagatccaga 
tggcgccagc 
gtaatcatgg 
catacgagcc 
attacatgtg 
cgtttttcca 
ggtggcgaaa 
tgcgctctcc 
gaagcgtggc 
gctccaagct 
gtaactatcg 
ctggtaacag 
ggcctaacta
ttaccttcgg
gtggtttttt 
ctttgatctt 
tggtcatgag 
gcattaagcg
ctagcgcccg 
aagctctaaa 
ccaaaaaact 
ttcgcccttt 
caacactcaa 
cctattggtt 
taacgtttac 
tatcaaccgg
atgcttaatc
ctgactcccc 
tgcaatgata 
agccggaagg
taattgttgc 
tgccattgct



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 






acaggcatcg 
cgatcaaggc 
cctccgatcg 
ctgcataatt 
tcaaccaagt 
acacgggata 
tcttcggggc 
actcgtgcac 
aaaacaggaa 
ctcatactct 
ggatacatat 
cgaaaagtgc 
aagcgaagcg 
cctcttcgct 
taacgccagg 
cacttaaggc 
cgaagctagc 



tggtgtcacg 
gagttacatg 
ttgtcagaag 
ctcttactgt 
cattctgaga 
ataccgcgcc 
gaaaactctc
ccaactgatc 
ggcaaaatgc 
tcctttttca 
ttgaatgtat 
cacctgacgt 
ccattcgcca 
attacgccag 
gttttcccag 
cttgactaga 
gaattcgtgg 



ctcgtcgttt 
atcccccatg 
taagttggcc 
catgccatcc 
atagtgtatg 
acatagcaga 
aaggatctta 
ttcagcatct
cgcaaaaaag 
atattattga 
ttagaaaaat 
agttaacaaa 
ttcaggctgc 
ctggcgaaag 
tcacgacgtt 

atc



ggtatggctt 
ttgtgcaaaa 
gcagtgttat 
gtaagatgct
cggcgaccga 
actttaaaag 
ccgctgttga 
tttactttca
ggaataaggg 
agcatttatc 
aaacaaatag 
aaaaagcccg 
gcaactgttg 

ggggatgtgc 

gtaaaacgac 
atgcctgtac 



cattcagctc 
aagcggttag 
cactcatggt 
tttctgtgac 
gttgctcttg
tgctcatcat 
gatccagttc 
ccagcgtttc 
cgacacggaa
agggttattg 
gggttccgcg 
ccgaagcggg 
ggaagggcga
tgcaaggcga
ggccagtccg 
atccggagac



cggttcccaa 
ctccttcggt 
tatggcagca 
tggtgagtac 
cccggcgtca 
tggagaacgt 
gatgtaaccc 
tgggtgagca 
atgttgaata 
tctcatgagc 
cacatttccc 
ctttattacc 
tcggtgcggg
ttaagttggg 
taatacgact 
gcgtcacggc 



3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4223 



<210> 122 
<211> 2686 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pUC18 Plasmid 



<400> 122 

tcgcgcgttt 

cagcttgtct 

ttggcgggtg 

accatatgcg

attcgccatt

tacgccagct 

tttcccagtc 

actctagagg 

gtgtgaaatt 

aaagcctggg 

gctttccagt 

agaggcggtt

gtcgttcggc 

gaatcagggg 

cgtaaaaagg

aaaaatcgac 

tttccccctg 

ctgtccgcct 

ctcagttcgg

cccgaccgct 

ttatcgccac 

gctacagagt 

atctgcgctc 

aaacaaacca 

aaaaaaggat 

gaaaactcac 

cttttaaatt 

gacagttacc 

tccatagttg 

ggccccagtg 

ataaaccagc 

atccagtcta 

cgcaacgttg 

tcattcagct 

aaagcggtta 

tcactcatgg 

ttttctgtga 

agttgctctt 

gtgctcatca

agatccagtt 



cggtgatgac
gtaagcggat
tcggggctgg
gtgtgaaata 
caggctgcgc 
ggcgaaaggg 
acgacgttgt 
atccccgggt 
gttatccgct
gtgcctaatg 
cgggaaacct
tgcgtattgg 
tgcggcgagc
ataacgcagg
ccgcgttgct 
gctcaagtca 
gaagctccct 
ttctcccttc 
tgtaggtcgt 
gcgccttatc 
tggcagcagc 
tcttgaagtg 
tgctgaagcc
ccgctggtag 
ctcaagaaga 
gttaagggat 
aaaaatgaag 
aatgcttaat 
cctgactccc 
ctgcaatgat
cagccggaag
ttaattgttg 
ttgccattgc 
ccggttccca 
gctccttcgg 
ttatggcagc 
ctggtgagta 
gcccggcgtc 
ttggaaaacg 
cgatgtaacc 



ggtgaaaacc 
gccgggagca
cttaactatg 
ccgcacagat 
aactgttggg 
ggatgtgctg 
aaaacgacgg
accgagctcg 
cacaattcca 
agtgagctaa 
gtcgtgccag 
gcgctcttcc 
ggtatcagct 
aaagaacatg 
ggcgtttttc 
gaggtggcga 
cgtgcgctct 
gggaagcgtg 
tcgctccaag 
cggtaactat
cactggtaac 
gtggcctaac 
agttaccttc 
cggtggtttt 
tcctttgatc 
tttggtcatg 
ttttaaatca 
cagtgaggca 
cgtcgtgtag 
accgcgagac 
ggccgagcgc 
ccgggaagct
tacaggcatc 
acgatcaagg 
tcctccgatc 
actgcataat
ctcaaccaag 
aatacgggat
ttcttcgggg 
cactcgtgca 



tctgacacat 
gacaagcccg 
cggcatcaga
gcgtaaggag
aagggcgatc
caaggcgatt
ccagtgccaa 
aattcgtaat 
cacaacatac 
ctcacattaa 
ctgcattaat 
gcttcctcgc 
cactcaaagg 
tgagcaaaag 
cataggctcc 
aacccgacag 
cctgttccga 
gcgctttctc 
ctgggctgtg 
cgtcttgagt 
aggattagca 
tacggctaca 
ggaaaaagag 
tttgtttgca 
ttttctacgg
agattatcaa 
atctaaagta 
cctatctcag 
ataactacga 
ccacgctcac 
agaagtggtc 
agagtaagta 
gtggtgtcac 
cgagttacat 
gttgtcagaa 
tctcttactg 
tcattctgag 
aataccgcgc 
cgaaaactct 
cccaactgat 



gcagctcccg 

tcagggcgcg

gcagattgta 

aaaataccgc

ggtgcgggcc 

aagttgggta 

gcttgcatgc

catggtcata 

gagccggaag

ttgcgttgcg 

gaatcggcca

tcactgactc 

cggtaatacg

gccagcaaaa

gcccccctga 

gactataaag 

ccctgccgct 

atagctcacg 

tgcacgaacc 

ccaacccggt 

gagcgaggta

ctagaaggac

ttggtagctc 

agcagcagat 

ggtctgacgc

aaaggatctt 

tatatgagta 

cgatctgtct 

tacgggaggg 

cggctccaga 

ctgcaacttt 

gttcgccagt 

gctcgtcgtt

gatcccccat 

gtaagttggc 

tcatgccatc 

aatagtgtat 

cacatagcag 

caaggatctt 

cttcagcatc 



gagacggtca
tcagcgggtg
ctgagagtgc 
atcaggcgcc 
tcttcgctat
acgccagggt
ctgcaggtcg
gctgtttcct 
cataaagtgt 
ctcactgccc 
acgcgcgggg
gctgcgctcg 
gttatccaca 
ggccaggaac
cgagcatcac 
ataccaggcg 
taccggatac
ctgtaggtat 
ccccgttcag 
aagacacgac 
tgtaggcggt 
agtatttggt 
ttgatccggc
tacgcgcaga
tcagtggaac 
cacctagatc 
aacttggtct 
atttcgttca
cttaccatct 
tttatcagca 
atccgcctcc 
taatagtttg 
tggtatggct 
gttgtgcaaa 
cgcagtgtta 
cgtaagatgc 
gcggcgaccg 
aactttaaaa 
accgctgttg 
ttttactttc 



60 

120 

180 

240 

300

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 






accagcgttt ctgggtgagc aaaaacagga aggcaaaatg 
gcgacacgga aatgttgaat actcatactc ttcctttttc 
cagggttatt gtctcatgag cggatacata tttgaatgta 
ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg 
atgacattaa cctataaaaa taggcgtatc acgaggccct 

<210> 123 
<211> 8521 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCXeGFPattB(6xHS4)2 Plasmid 
<400> 123 

tacggggcgg gggatccact agttattaat agtaatcaat tacggggtca ttagttcata 60 
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120 
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180 
ggactttcca ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac 240 
atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300
cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360
tattagtcat cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc 420
atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480
gcgatggggg cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540 
gggcggggcg aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600 
tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660 
gggagtcgct gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720 
ccggctctga ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780
gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840 
ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900 
tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960 
cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020
ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140
cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200
gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260
ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440
tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500 
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560 
acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620
gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680 
acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcgccacc atggtgagca 1740
agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac ggcgacgtaa 1800
acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac ggcaagctga 1860
ccctgaagtt catctgcacc accggcaagc tgcccgtgcc ctggcccacc ctcgtgacca 1920
ccctgaccta cggcgtgcag tgcttcagcc gctaccccga ccacatgaag cagcacgact 1980
tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc ttcaaggacg 2040
acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg gtgaaccgca 2100
tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac aagctggagt 2160
acaactacaa cagccacaac gtctatatca tggccgacaa gcagaagaac ggcatcaagg 2220
tgaacttcaa gatccgccac aacatcgagg acggcagcgt gcagctcgcc gaccactacc 2280
agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac tacctgagca 2340
cccagtccgc cctgagcaaa gaccccaacg agaagcgcga tcacatggtc ctgctggagt 2400
tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaagtaa gaattcactc 2460
ctcaggtgca ggctgcctat cagaaggtgg tggctggtgt ggccaatgcc ctggctcaca 2520 
aataccactg agatcttttt ccctctgcca aaaattatgg ggacatcatg aagccccttg 2580 
agcatctgac ttctggctaa taaaggaaat ttattttcat tgcaatagtg tgttggaatt 2640
ttttgtgtct ctcactcgga aggacatatg ggagggcaaa tcatttaaaa catcagaatg 2700
agtatttggt ttagagtttg gcaacatatg ccatatgctg gctgccatga acaaaggtgg 2760
ctataaagag gtcatcagta tatgaaacag ccccctgctg tccattcctt attccataga 2820
aaagccttga cttgaggtta gatttttttt atattttgtt ttgtgttatt tttttcttta 2880
acatccctaa aattttcctt acatgtttta ctagccagat ttttcctcct ctcctgacta 2940
ctcccagtca tagctgtccc tcttctctta tgaagatccc tcgacctgca gcccaagctt 3000
ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca 3060
caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact 3120



ccgcaaaaaa gggaataagg 2460

aatattattg aagcatttat 2520

tttagaaaaa taaacaaata 2580

tctaagaaac cattattatc 2640

ttcgtc 2686



cacattaatt gcgttgcgct cactgcccgc 
gatccgcatc tcaattagtc agcaaccata 
ctaactccgc ccagttccgc ccattctccg 
gcagaggccg aggccgcctc ggcctctgag 
ggaggctagt ggatcccccg ccccgtatcc 
agaagcgttc agaggaaagc gatcccgtgc 
cgctgccggc tcggggatgc ggggggagcg 
ctgctgcccc ctagcggggg agggacgtaa 
gtccccgtga gcggatccgc ggccccgtat 
cgagaagcgt tcagaggaaa gcgatcccgt 
cacgctgccg gctcggggat gcggggggag 
cgctgctgcc ccctagcggg ggagggacgt 
ctgtccccgt gagcggatcc gcggccccgt 
agcgagaagc gttcagagga aagcgatccc 
cgcacgctgc cggctcgggg atgcgggggg
ctcgctgctg ccccctagcg ggggagggac 
ggctgtcccc gtgagcggat ccgcggcccc 
gcagcgagaa gcgttcagag gaaagcgatc 
cccgcacgct gccggctcgg ggatgcgggg 
ggctcgctgc tgccccctag cgggggaggg 
ggggctgtcc ccgtgagcgg atccgcggcc 
gagcagcgag aagcgttcag aggaaagcga 
tccccgcacg ctgccggctc ggggatgcgg 
gcggctcgct gctgccccct agcgggggag 
ggggggctgt ccccgtgagc ggatccgcgg 
aagagcagcg agaagcgttc agaggaaagc 
tgtccccgca cgctgccggc tcggggatgc 
gggcggctcg ctgctgcccc ctagcggggg 
ggggggggct gtccccgtga gcggatccgc
tttttatact aacttgagcg aaatcaagct 
attgcagctt ataatggtta caaataaagc 
tttttttcac tgcattctag ttgtggtttg 
tggatccgct gcattaatga atcggccaac 
gctcttccgc ttcctcgctc actgactcgc 
tatcagctca ctcaaaggcg gtaatacggt 
agaacatgtg agcaaaaggc cagcaaaagg 
cgtttttcca taggctccgc ccccctgacg 
ggtggcgaaa cccgacagga ctataaagat 
tgcgctctcc tgttccgacc ctgccgctta 
gaagcgtggc gctttctcaa tgctcacgct 
gctccaagct gggctgtgtg cacgaacccc 
gtaactatcg tcttgagtcc aacccggtaa 
ctggtaacag gattagcaga gcgaggtatg 
ggcctaacta cggctacact agaaggacag 
ttaccttcgg aaaaagagtt ggtagctctt 
gtggtttttt tgtttgcaag cagcagatta 
ctttgatctt ttctacgggg tctgacgctc 
tggtcatgag attatcaaaa aggatcttca 
ttaaatcaat ctaaagtata tatgagtaaa 
gtgaggcacc tatctcagcg atctgtctat 
tcgtgtagat aactacgata cgggagggct 
cgcgagaccc acgctcaccg gctccagatt 
ccgagcgcag aagtggtcct gcaactttat 
gggaagctag agtaagtagt tcgccagtta 
caggcatcgt ggtgtcacgc tcgtcgtttg 
gatcaaggcg agttacatga tcccccatgt 
ctccgatcgt tgtcagaagt aagttggccg 
tgcataattc tcttactgtc atgccatccg 
caaccaagtc attctgagaa tagtgtatgc 
tacgggataa taccgcgcca catagcagaa 
cttcggggcg aaaactctca aggatcttac 
ctcgtgcacc caactgatct tcagcatctt 
aaacaggaag gcaaaatgcc gcaaaaaagg 
tcatactctt cctttttcaa tattattgaa 
gatacatatt tgaatgtatt tagaaaaata 
gaaaagtgcc acctggtcga cggtatcgat 
ggatccgctc acggggacag ccccccccca 
tttccagtcg ggaaacctgt cgtgccagcg 3180
gtcccgcccc taactccgcc catcccgccc 3240
ccccatggct gactaatttt ttttatttat 3300 
ctattccaga agtagtgagg aggctttttt 3360 
cccaggtgtc tgcaggctca aagagcagcg 3420 
caccttcccc gtgcccgggc tgtccccgca 3480 
ccggaccgga gcggagcccc gggcggctcg 3540
ttacatccct gggggctttg ggggggggct 3600
cccccaggtg tctgcaggct caaagagcag 3660
gccaccttcc ccgtgcccgg gctgtccccg 3720
cgccggaccg gagcggagcc ccgggcggct 3780
aattacatcc ctgggggctt tggggggggg 3840
atcccccagg tgtctgcagg ctcaaagagc 3900
gtgccacctt ccccgtgccc gggctgtccc 3960
agcgccggac cggagcggag ccccgggcgg 4020
gtaattacat ccctgggggc tttggggggg 4080
gtatccccca ggtgtctgca ggctcaaaga 4140
ccgtgccacc ttccccgtgc ccgggctgtc 4200
ggagcgccgg accggagcgg agccccgggc 4260
acgtaattac atccctgggg gctttggggg 4320
ccgtatcccc caggtgtctg caggctcaaa 4380
tcccgtgcca ccttccccgt gcccgggctg 4440
ggggagcgcc ggaccggagc ggagccccgg 4500
ggacgtaatt acatccctgg gggctttggg 4560
ccccgtatcc cccaggtgtc tgcaggctca 4620
gatcccgtgc caccttcccc gtgcccgggc 4680
ggggggagcg ccggaccgga gcggagcccc 4740
agggacgtaa ttacatccct gggggctttg 4800
ggggctgcag gaattcgatt gaagcctgct 4860
cctaggcttt tgcaaaaagc taacttgttt 4920
aatagcatca caaatttcac aaataaagca 4980
tccaaactca tcaatgtatc ttatcatgtc 5040
gcgcggggag aggcggtttg cgtattgggc 5100
tgcgctcggt cgttcggctg cggcgagcgg 5160
tatccacaga atcaggggat aacgcaggaa 5220
ccaggaaccg taaaaaggcc gcgttgctgg 5280
agcatcacaa aaatcgacgc tcaagtcaga 5340
accaggcgtt tccccctgga agctccctcg 5400
ccggatacct gtccgccttt ctcccttcgg 5460
gtaggtatct cagttcggtg taggtcgttc 5520
ccgttcagcc cgaccgctgc gccttatccg 5580
gacacgactt atcgccactg gcagcagcca 5640
taggcggtgc tacagagttc ttgaagtggt 5700
tatttggtat ctgcgctctg ctgaagccag 5760
gatccggcaa acaaaccacc gctggtagcg 5820
cgcgcagaaa aaaaggatct caagaagatc 5880
agtggaacga aaactcacgt taagggattt 5940
cctagatcct tttaaattaa aaatgaagtt 6000
cttggtctga cagttaccaa tgcttaatca 6060
ttcgttcatc catagttgcc tgactccccg 6120
taccatctgg ccccagtgct gcaatgatac 6180
tatcagcaat aaaccagcca gccggaaggg 6240
ccgcctccat ccagtctatt aattgttgcc 6300
atagtttgcg caacgttgtt gccattgcta 6360
gtatggcttc attcagctcc ggttcccaac 6420
tgtgcaaaaa agcggttagc tccttcggtc 6480
cagtgttatc actcatggtt atggcagcac 6540
taagatgctt ttctgtgact ggtgagtact 6600
ggcgaccgag ttgctcttgc ccggcgtcaa 6660
ctttaaaagt gctcatcatt ggaaaacgtt 6720
cgctgttgag atccagttcg atgtaaccca 6780
ttactttcac cagcgtttct gggtgagcaa 6840
gaataagggc gacacggaaa tgttgaatac 6900
gcatttatca gggttattgt ctcatgagcg 6960 
aacaaatagg ggttccgcgc acatttcccc 7020 
aagcttgata tcgaattcct gcagccccgc 7080 
aagcccccag ggatgtaatt acgtccctcc 7140 
cccgctaggg ggcagcagcg agccgcccgg
atccccgagc cggcagcgtg cggggacagc 
tttcctctga acgcttctcg ctgctctttg 
gcggatccgc tcacggggac agcccccccc 
cccccgctag ggggcagcag cgagccgccc 
gcatccccga gccggcagcg tgcggggaca 
gctttcctct gaacgcttct cgctgctctt 
ccgcggatcc gctcacgggg acagcccccc 
ctcccccgct agggggcagc agcgagccgc 
ccgcatcccc gagccggcag cgtgcgggga 
tcgctttcct ctgaacgctt ctcgctgctc 
ggccgcggat ccgctcacgg ggacagcccc 
ccctcccccg ctagggggca gcagcgagcc 
ccccgcatcc ccgagccggc agcgtgcggg 
gatcgctttc ctctgaacgc ttctcgctgc 
ggggccgcgg atccgctcac ggggacagcc 
gtccctcccc cgctaggggg cagcagcgag 
ccccccgcat ccccgagccg gcagcgtgcg 
gggatcgctt tcctctgaac gcttctcgct 
acggggccgc ggatccgctc acggggacag 
acgtccctcc cccgctaggg ggcagcagcg 
ctccccccgc atccccgagc cggcagcgtg 
acgggatcgc tttcctctga acgcttctcg 



ggctccgctc cggtccggcg ctccccccgc 7200 
ccgggcacgg ggaaggtggc acgggatcgc 7260
agcctgcaga cacctggggg atacggggcc 7320
caaagccccc agggatgtaa ttacgtccct 7380
ggggctccgc tccggtccgg cgctcccccc 7440
gcccgggcac ggggaaggtg gcacgggatc 7500
tgagcctgca gacacctggg ggatacgggg 7560
cccaaagccc ccagggatgt aattacgtcc 7620
ccggggctcc gctccggtcc ggcgctcccc 7680
cagcccgggc acggggaagg tggcacggga 7740
tttgagcctg cagacacctg ggggatacgg 7800
cccccaaagc ccccagggat gtaattacgt 7860
gcccggggct ccgctccggt ccggcgctcc 7920
gacagcccgg gcacggggaa ggtggcacgg 7980
tctttgagcc tgcagacacc tgggggatac 8040
cccccccaaa gcccccaggg atgtaattac 8100
ccgcccgggg ctccgctccg gtccggcgct 8160
gggacagccc gggcacgggg aaggtggcac 8220
gctctttgag cctgcagaca cctgggggat 8280
ccccccccca aagcccccag ggatgtaatt 8340
agccgcccgg ggctccgctc cggtccggcg 8400
cggggacagc ccgggcacgg ggaaggtggc 8460 
ctgctctttg agcctgcaga cacctggggg 8520 

8521 
<400> 124

cagttgccgg ccgggtcgcg cagggcgaac 
gtcatggccg gcccggaggc gtcccggaag 
tacagctcgt ccaggccgcg cacccacacc 
tcctggaccg cgctgatgaa cagggtcacg 
tccacgaagt cccgggagaa cccgagccgg 
tcgcgcgcgg tgagcaccgg aacggcactg 
caagttagta taaaaaagca ggcttcaatc 
agccccgcgg atccgctcac ggggacagcc 
gtccctcccc cgctaggggg cagcagcgag 
ccccccgcat ccccgagccg gcagcgtgcg 
gggatcgctt tcctctgaac gcttctcgct 
acggggccgc ggatccgctc acggggacag 
acgtccctcc cccgctaggg ggcagcagcg 
ctccccccgc atccccgagc cggcagcgtg 
acgggatcgc tttcctctga acgcttctcg 
atacggggcc gcggatccgc tcacggggac 
ttacgtccct cccccgctag ggggcagcag 
cgctcccccc gcatccccga gccggcagcg 
gcacgggatc gctttcctct gaacgcttct 
ggatacgggg ccgcggatcc gctcacgggg 
aattacgtcc ctcccccgct agggggcagc 
ggcgctcccc ccgcatcccc gagccggcag 
tggcacggga tcgctttcct ctgaacgctt 
ggggatacgg ggccgcggat ccgctcacgg 
gtaattacgt ccctcccccg ctagggggca 
ccggcgctcc ccccgcatcc ccgagccggc 
ggtggcacgg gatcgctttc ctctgaacgc 
tgggggatac ggggccgcgg atccgctcac 
atgtaattac gtccctcccc cgctaggggg 
gtccggcgct ccccccgcat ccccgagccg 
aaggtggcac gggatcgctt tcctctgaac 
cctgggggat acggggcggg ggatccacta 
tagttcatag cccatatatg gagttccgcg 



tcccgccccc acggctgctc gccgatctcg 60
ttcgtggaca cgacctccga ccactcggcg 120
caggccaggg tgttgtccgg caccacctgg 180
tcgtcccgga ccacaccggc gaagtcgtcc 240
tcggtccaga actcgaccgc tccggcgacg 300
gtcaacttgg ccatggatcc agatttcgct 360
ctgcagagaa gcttgatatc gaattcctgc 420
cccccccaaa gcccccaggg atgtaattac 480
ccgcccgggg ctccgctccg gtccggcgct 540
gggacagccc gggcacgggg aaggtggcac 600
gctctttgag cctgcagaca cctgggggat 660
ccccccccca aagcccccag ggatgtaatt 720
agccgcccgg ggctccgctc cggtccggcg 780
cggggacagc ccgggcacgg ggaaggtggc 840
ctgctctttg agcctgcaga cacctggggg 900
agcccccccc caaagccccc agggatgtaa 960
cgagccgccc ggggctccgc tccggtccgg 1020
tgcggggaca gcccgggcac ggggaaggtg 1080
cgctgctctt tgagcctgca gacacctggg 1140
acagcccccc cccaaagccc ccagggatgt 1200
agcgagccgc ccggggctcc gctccggtcc 1260
cgtgcgggga cagcccgggc acggggaagg 1320
ctcgctgctc tttgagcctg cagacacctg 1380
ggacagcccc cccccaaagc ccccagggat 1440
gcagcgagcc gcccggggct ccgctccggt 1500
agcgtgcggg gacagcccgg gcacggggaa 1560
ttctcgctgc tctttgagcc tgcagacacc 1620
ggggacagcc cccccccaaa gcccccaggg 1680
cagcagcgag ccgcccgggg ctccgctccg 1740
gcagcgtgcg gggacagccc gggcacgggg 1800
gcttctcgct gctctttgag cctgcagaca 1860
gttattaata gtaatcaatt acggggtcat 1920
ttacataact tacggtaaat ggcccgcctg 1980 
gctgaccgcc
cgccaatagg 
tggcagtaca 
aatggcccgc 
acatctacgt 
actctcccca 
ttttgtgcag 
cgaggggcgg 
ccgaaagttt 
gcggcgggcg 
ccgcccgccc 
ttctcctccg 
gcgtgaaagc 
gtgcgtgcgt 
tgagcgctgc 
gccgggggcg 
gtgtgtgcgt 
ctgcaccccc 
gggcgtggcg 
ggggcggggc 
ccggcggctg 
gggcgcaggg 
caccccctct 
gagggccttc 
cgcaggggga 
tgaccggcgg 
ccctgctgtc 
acagccgagt 
gctgtgctga 
tctatgcctg 
ccctgctgtc 
gggagcccct 
tgcttcgggc 
ctccactccg 
tcctccgggg 
gtacaagtaa 
ggccaatgcc 
ggacatcatg 
tgcaatagtg 
tcatttaaaa
gctgccatga 
tccattcctt 
ttgtgttatt 
ttttcctcct 
tcgacctgca 
atcccccagg 
gtgccacctt 
agcgccggac 
gtaattacat 
gtatccccca 
ccgtgccacc 
ggagcgccgg 
acgtaattac 
ccgtatcccc 
tcccgtgcca 

ggggagcgcc 

ggacgtaatt 
ccccgtatcc 
gatcccgtgc 

ggggggagcg 

agggacgtaa 
ggccccgtat 
gcgatcccgt 
gcggggggag 
ggagggacgt 
gcggccccgt 
aagcgatccc 



caacgacccc 
gactttccat 
tcaagtgtat 
ctggcattat 
attagtcatc 
tctccccccc 
cgatgggggc 
ggcggggcga 
ccttttatgg 
ggagtcgctg 
cggctctgac 
ggctgtaatt 
cttaaagggc 
gtgtgtgtgc 
gggcgcggcg 
gtgccccgcg 

gggggggtga 

ctccccgagt 
cggggctcgc 
cgcctcgggc 
tcgaggcgcg 
acttcctttg 
agcgggcgcg 
gtgcgtcgcc 
cggctgcctt 
ctctagaatg 
gctccctctg 
cctggagagg 
acactgcagc 
gaagaggatg 
ggaagctgtc
gcagctgcat 
tctgggagcc 
aacaatcact 
aaagctgaag 
gaattcactc 
ctggctcaca 
aagccccttg 
tgttggaatt 
catcagaatg 
acaaaggtgg 
attccataga 
tttttcttta 
ctcctgacta 
gcccaagctt 
tgtctgcagg 
ccccgtgccc 
cggagcggag 
ccctgggggc 
ggtgtctgca 
ttccccgtgc 
accggagcgg 
atccctgggg 
caggtgtctg 
ccttccccgt 
ggaccggagc 
acatccctgg 
cccaggtgtc 
caccttcccc 
ccggaccgga 
ttacatccct 
cccccaggtg 
gccaccttcc 
cgccggaccg 
aattacatcc 
atcccccagg 
gtgccacctt 



cgcccattga 
tgacgtcaat 
catatgccaa 
gcccagtaca 
gctattacca 
ctccccaccc 

gggggggggg 

ggcggagagg 
cgaggcggcg 
cgttgccttc 
tgaccgcgtt 
agcgcttggt
tccgggaggg 
gtggggagcg 
cggggctttg 
gtgcgggggg 
gcagggggtg 
tgctgagcac 
cgtgccgggc 
cggggagggc 
gcgagccgca 
tcccaaatct 
ggcgaagcgg 
gcgccgccgt 
cgggggggac 
ggggtgcacg 
ggcctcccag 
tacctcttgg 
ttgaatgaga 
gaggtcgggc 
ctgcggggcc 
gtggataaag 
cagaaggaag 
gctgacactt 
ctgtacacag 
ctcaggtgca 
aataccactg 
agcatctgac 
ttttgtgtct 
agtatttggt
ctataaagag 
aaagccttga 
acatccctaa 
ctcccagtca 
gcatgcctgc 
ctcaaagagc 
gggctgtccc 
ccccgggcgg 
tttggggggg 
ggctcaaaga
ccgggctgtc 
agccccgggc 
gctttggggg 
caggctcaaa 
gcccgggctg 
ggagccccgg 
gggctttggg 
tgcaggctca 
gtgcccgggc 
gcggagcccc 

gggggctttg 
tctgcaggct 
ccgtgcccgg 
gagcggagcc 
ctgggggctt 
tgtctgcagg 
ccccgtgccc 



cgtcaataat 
gggtggacta 
gtacgccccc 
tgaccttatg 
tgggtcgagg 
ccaattttgt 

ggggcgcgcg 
tgcggcggca 
gcggcggcgg 
gccccgtgcc 
actcccacag 
ttaatgacgg 
ccctttgtgc 
ccgcgtgcgg 
tgcgctccgc 
gctgcgaggg 
tgggcgcggc 
ggcccggctt 

ggggggtggc

tcgggggagg 
gccattgcct 
ggcggagccg 
tgcggcgccg 
ccccttctcc 

ggggcagggc 

aatgtcctgc 
tcctgggcgc 
aggccaagga 
atatcactgt 
agcaggccgt 
aggccctgtt 
ccgtcagtgg 
ccatctcccc 
tccgcaaact 
gggaggcctg 
ggctgcctat 
agatcttttt 
ttctggctaa 
ctcactcgga 
ttagagtttg 
gtcatcagta 
cttgaggtta 
aattttcctt 
tagctgtccc 
aggtcgactc 
agcgagaagc 
cgcacgctgc 
ctcgctgctg 
ggctgtcccc 
gcagcgagaa 
cccgcacgct 
ggctcgctgc 

ggggctgtcc 

gagcagcgag 
tccccgcacg 
gcggctcgct 

ggggggctgt 

aagagcagcg 
tgtccccgca 

gggcggctcg 
ggggggggct 

caaagagcag 
gctgtccccg 
ccgggcggct 
tggggggggg 
ctcaaagagc 
gggctgtccc 



gacgtatgtt 
tttacggtaa 
tattgacgtc 
ggactttcct 
tgagccccac 
atttatttat 
ccaggcgggg
gccaatcaga 
ccctataaaa 
ccgctccgcg 
gtgagcgggc 
ctcgtttctt 

gggggggagc 

cccgcgctgc 
gtgtgcgcga 
gaacaaaggc 
ggtcgggctg 
cgggtgcggg 
ggcaggtggg 
ggcgcggcgg 
tttatggtaa 
aaatctggga 
gcaggaagga 
atctccagcc 

ggggttcggc 

ctggctgtgg 
cccaccacgc 
ggccgagaat 
cccagacacc 
agaagtctgg 
ggtcaactct 
ccttcgcagc 
tccagatgcg 
cttccgagtc 
caggacaggg 
cagaaggtgg 
ccctctgcca 
taaaggaaat 
aggacatatg 
gcaacatatg 
tatgaaacag 
gatttttttt 
acatgtttta 
tcttctctta 
tagtggatcc 
gttcagagga 
cggctcgggg 
ccccctagcg 
gtgagcggat 
gcgttcagag 
gccggctcgg 
tgccccctag 
ccgtgagcgg 
aagcgttcag 
ctgccggctc 
gctgccccct 
ccccgtgagc 
agaagcgttc
cgctgccggc
ctgctgcccc 
gtccccgtga 
cgagaagcgt 
cacgctgccg 
cgctgctgcc 
ctgtccccgt 
agcgagaagc 
cgcacgctgc 



cccatagtaa 
actgcccact 
aatgacggta 
acttggcagt 
gttctgcttc 
tttttaatta 
cggggcgggg 
gcggcgcgct 
agcgaagcgc 
ccgcctcgcg 
gggacggccc 
ttctgtggct 
ggctcggggg 
ccggcggctg 

ggggagcgcg 

tgcgtgcggg 
taaccccccc 
gctccgtgcg 
ggtgccgggc 
ccccggagcg 
tcgtgcgaga 
ggcgccgccg 
aatgggcggg 
tcggggctgc 
ttctggcgtg 
cttctcctgt 
ctcatctgtg 
atcacgacgg 
aaagttaatt 
cagggcctgg 
tcccagccgt 
ctcaccactc 
gcctcagctg 
tactccaatt 
gacagatgac 
tggctggtgt 
aaaattatgg 
ttattttcat 
ggagggcaaa 
ccatatgctg 
ccccctgctg 
atattttgtt 
ctagccagat 
tgaagatccc 
cccgccccgt 
aagcgatccc 
atgcgggggg 
ggggagggac 

ccgcggcccc 
gaaagcgatc
ggatgcgggg 

cgggggaggg 

atccgcggcc 
aggaaagcga 

ggggatgcgg 

agcgggggag 
ggatccgcgg 
agaggaaagc
tcggggatgc 
ctagcggggg 
gcggatccgc 
tcagaggaaa 
gctcggggat 
ccctagcggg 
gagcggatcc 
gttcagagga 
cggctcgggg 



2040 
2100 
2160
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
atgcgggggg
ggggagggac 
ccgcggggct 
ccgctcacaa 
taatgagtga 
aacctgtcgt 
attgggcgct 
cgagcggtat 
gcaggaaaga 
ttgctggcgt 
agtcagaggt 
tccctcgtgc 
ccttcgggaa 
gtcgttcgct
ttatccggta 
gcagccactg 
aagtggtggc 
aagccagtta
ggtagcggtg 
gaagatcctt 
gggattttgg 
tgaagtttta 
ttaatcagtg 
ctccccgtcg 
atgataccgc 
ggaagggccg 
tgttgccggg 
attgctacag
tcccaacgat 
ttcggtcctc 
gcagcactgc 
gagtactcaa
gcgtcaatac 
aaacgttctt 
taacccactc 
tgagcaaaaa 
tgaatactca 
atgagcggat 
tttccccgaa
attaccaagc 
tgcgggcctc 
gttgggtaac 
acgactcact 
gatgagtttg
tgtgatgcta 
gaagaactcc 
gattccgaag 
tcagtcctgc 



agcgccggac 
gtaattacat 
gcaggaattc 
ttccacacaa 
gctaactcac 
gccagctgca 
cttccgcttc 
cagctcactc 
acatgtgagc 
ttttccatag 
ggcgaaaccc 
gctctcctgt 
gcgtggcgct 
ccaagctggg 
actatcgtct 
gtaacaggat 
ctaactacgg 
ccttcggaaa 
gtttttttgt
tgatcttttc 
tcatgagatt 
aatcaatcta 
aggcacctat 
tgtagataac 
gagacccacg 
agcgcagaag 
aagctagagt 
gcatcgtggt 
caaggcgagt 
cgatcgttgt 
ataattctct 
ccaagtcatt 
gggataatac 
cggggcgaaa 
gtgcacccaa 
caggaaggca 
tactcttcct
acatatttga 
aagtgccacc 
gaagcgccat 
ttcgctatta 
gccagggttt 
taaggccttg 
gacaaaccac 
ttgctttatt 
agcatgagat 
cccaaccttt 
tcctcggcca 



cggagcggag 
ccctgggggc 
gtaatcatgg 
catacgagcc 
attaattgcg 
ttaatgaatc
ctcgctcact
aaaggcggta 
aaaaggccag 
gctccgcccc 
gacaggacta 
tccgaccctg 
ttctcatagc 
ctgtgtgcac 
tgagtccaac 
tagcagagcg 
ctacactaga 
aagagttggt 
ttgcaagcag 
tacggggtct 
atcaaaaagg 
aagtatatat 
ctcagcgatc 
tacgatacgg 
ctcaccggct 
tggtcctgca 
aagtagttcg 
gtcacgctcg 
tacatgatcc 
cagaagtaag 
tactgtcatg 
ctgagaatag 
cgcgccacat 
actctcaagg 
ctgatcttca 
aaatgccgca 
ttttcaatat 
atgtatttag 
tgacgtagtt 
tcgccattca 
cgccagctgg
tcccagtcac 
actagagggt 
aactagaatg 
tgtaaccatt 
ccccgcgctg 
catagaaggc 
cgaagtgcac



ccccgggcgg 
tttggggggg 
tcatagctgt 
ggaagcataa 
ttgcgctcac 
ggccaacgcg 
gactcgctgc 
atacggttat
caaaaggcca
cctgacgagc 
taaagatacc 
ccgcttaccg
tcacgctgta
gaaccccccg
ccggtaagac
aggtatgtag
aggacagtat
agctcttgat
cagattacgc
gacgctcagt
atcttcacct 
gagtaaactt 
tgtctatttc 
gagggcttac 
ccagatttat 
actttatccg 
ccagttaata 
tcgtttggta 
cccatgttgt 
ttggccgcag 
ccatccgtaa 
tgtatgcggc
agcagaactt 
atcttaccgc 
gcatctttta 
aaaaagggaa 
tattgaagca 
aaaaataaac 
aacaaaaaaa 
ggctgcgcaa 
cgaaaggggg 
gacgttgtaa 
cgacggtata
cagtgaaaaa
ataagctgca
gaggatcatc

ggcggtggaa 



ctcgctgctg 
ggctgtcccc 
ttcctgtgtg 
agtgtaaagc 
tgcccgcttt
cggggagagg
gctcggtcgt
ccacagaatc 
ggaaccgtaa 
atcacaaaaa 
aggcgtttcc
gatacctgtc 
ggtatctcag 
ttcagcccga 
acgacttatc 
gcggtgctac 
ttggtatctg 
ccggcaaaca 
gcagaaaaaa 
ggaacgaaaa 
agatcctttt 
ggtctgacag 
gttcatccat 
catctggccc 
cagcaataaa 
cctccatcca 
gtttgcgcaa 
tggcttcatt 
gcaaaaaagc 
tgttatcact 
gatgcttttc
gaccgagttg 
taaaagtgct 
tgttgagatc 
ctttcaccag 
taagggcgac
tttatcaggg 
aaataggggt 
agcccgccga 
ctgttgggaa 
atgtgctgca
aacgacggcc 
cagacatgat 
aatgctttat
ataaacaagt 
cagccggcgt 
tcgaaatctc 



ccccctagcg 
gtgageggat 
aaattgttat 
ctggggtgcc
ccagtcggga
cggtttgcgt
tcggctgcgg
aggggataac
aaaggccgcg 
tcgacgctca 
ccctggaagc 
cgcctttctc 
ttcggtgtag
ccgctgcgcc 
gccactggca 
agagttcttg
cgctctgctg 
aaccaccgct 
aggatctcaa 
ctcacgttaa 
aaattaaaaa 
ttaccaatgc 
agttgcctga 
cagtgctgca
ccagccagcc 
gtctattaat 
cgttgttgcc 
cagctccggt 
ggttagctcc 
catggttatg 
tgtgactggt 
ctcttgcccg 
catcattgga 
cagttcgatg 
cgtttctggg 
acggaaatgt
ttattgtctc 
tccgcgcaca 
agcgggcttt
gggcgatcgg
aggcgattaa
agtccgtaat 
aagatacatt 
ttgtgaaatt 
tggggtgggc 
cccggaaaac 
gtagcacgtg 



6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8851 



<210> 125 
<211> 10474 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> p18genEPO Plasmid



<400> 125 

cagttgccgg

gtcatggccg

tacagctcgt

tcctggaccg 

tccacgaagt 

tcgcgcgcgg 

caagttagta 

agccccgcgg 

gtccctcccc 



ccgggtcgcg
gcccggaggc 
ccaggccgcg 
cgctgatgaa 
cccgggagaa 
tgagcaccgg
taaaaaagca
atccgctcac 
cgctaggggg 



cagggcgaac
gtcccggaag
cacccacacc
cagggtcacg
cccgagccgg
aacggcactg
ggcttcaatc
ggggacagcc
cagcagcgag



tcccgccccc 
ttcgtggaca 
caggccaggg
tcgtcccgga
tcggtccaga
gtcaacttgg 
ctgcagagaa 
cccccccaaa 
ccgcccgggg 



acggctgctc 
cgacctccga 
tgttgtccgg 
ccacaccggc 
actcgaccgc 
ccatggatcc 
gcttgatatc 
gcccccaggg 
ctccgctccg 



gccgatctcg 
ccactcggcg 
caccacctgg 
gaagtcgtcc
tccggcgacg
agatttcgct
gaattcctgc 
atgtaattac 
gtccggcgct 



60 

120 

180 

240 

300 

360 

420 

480 

540 
ccccccgcat
gggatcgctt 
acggggccgc 
acgtccctcc 
ctccccccgc 
acgggatcgc 
atacggggcc 
ttacgtccct 
cgctcccccc 
gcacgggatc 
ggatacgggg 
aattacgtcc 
ggcgctcccc
tggcacggga 
ggggatacgg 
gtaattacgt 
ccggcgctcc 
ggtggcacgg 

tgggggatac 
atgtaattac 
gtccggcgct 
aaggtggcac
cctgggggat 
tagttcatag 
gctgaccgcc 
cgccaatagg 
tggcagtaca 
aatggcccgc 
acatctacgt 
actctcccca 
ttttgtgcag 
cgaggggcgg 
ccgaaagttt 
gcggcgggcg 
ccgcccgccc 
ttctcctccg 
gcgtgaaagc 
gtgcgtgcgt 

tgagcgctgc 
gccgggggcg 
gtgtgtgcgt 
ctgcaccccc 
gggcgtggcg 

ggggcggggc 

ccggcggctg 
gggcgcaggg 
caccccctct 
gagggccttc 
cgcaggggga 
tgaccggcgg 
tcgccctttc 
ccgggtccct 
tcaaggaccg
tgccagcggg 
acagtttggg 
ctgataagct 
gtcacaccag 
gcacacggca 
ggggacagga 
gccacccttc 
ctggctgtgg 
cccaccacgc 
ggccgagaat 
gcttcaggga 
agctagacac 
ctaggcaagg 
ggacccttga 



ccccgagccg 
tcctctgaac 
ggatccgctc 
cccgctaggg 
atccccgagc 
tttcctctga 
gcggatccgc 
cccccgctag 
gcatccccga 
gctttcctct 
ccgcggatcc 
ctcccccgct 
ccgcatcccc 
tcgctttcct 
ggccgcggat
ccctcccccg
ccccgcatcc
gatcgctttc
ggggccgcgg
gtccctcccc
ccccccgcat
gggatcgctt
acggggcggg
cccatatatg 
caacgacccc 
gactttccat 
tcaagtgtat 
ctggcattat 
attagtcatc 
tctccccccc 
cgatgggggc 
ggcggggcga
ccttttatgg 
ggagtcgctg 
cggctctgac 
ggctgtaatt 
cttaaagggc 
gtgtgtgtgc 
gggcgcggcg 
gtgccccgcg 

gggggggtga 

ctccccgagt 
cggggctcgc 
cgcctcgggc 
tcgaggcgcg
acttcctttg
agcgggcgcg
gtgcgtcgcc
cggctgcctt
ctctagatgc
tagaatgggg
gtttgagcgg
gcgacttgtc 
gacttggggg 
ggttgagggg 
gataacctgg 
gattgaagtt 
gcaggattga 
aggacgagct 
tccctccccg 
cttctcctgt 
ctcatctgtg 
atcacggtga 
actcctccca 
tgccccccta 
agcaaagcca
ctccccgggc 



gcagcgtgcg 
gcttctcgct 
acggggacag
ggcagcagcg
cggcagcgtg
acgcttctcg
tcacggggac
ggggcagcag
gccggcagcg
gaacgcttct
gctcacgggg
agggggcagc
gagccggcag
ctgaacgctt
ccgctcacgg 
ctagggggca 
ccgagccggc 
ctctgaacgc 
atccgctcac 
cgctaggggg 
ccccgagccg 
tcctctgaac 
ggatccacta 
gagttccgcg 
cgcccattga 
tgacgtcaat 
catatgccaa
gcccagtaca
gctattacca
ctccccaccc

gggggggggg 

ggcggagagg
cgaggcggcg
cgttgccttc
tgaccgcgtt
agcgcttggt
tccgggaggg
gtggggagcg
cggggctttg
gtgcgggggg 
gcagggggtg 
tgctgagcac 
cgtgccgggc 

cggggagggc

gcgagccgca
tcccaaatct
ggcgaagcgg
gcgccgccgt
cgggggggac

atgctcgagc
gtgcacggtg 
ggatttagcg 
aaggaccccg 
agtccttggg 
aagaaggttt 
gcgctggagc 
tggccggaga
atgaaggcca
ggggcagaga
cctgactctc
ccctgctgtc
acagccgagt
gaccccttcc 
gatccaggaa 
cataagaata 
gcagatccta 
tgtgtgcatt 



gggacagccc 
gctctttgag 
ccccccccca 
agccgcccgg 
cggggacagc
ctgctctttg
agcccccccc
cgagccgccc
tgcggggaca
cgctgctctt 
acagcccccc 
agcgagccgc 
cgtgcgggga 
ctcgctgctc 
ggacagcccc 
gcagcgagcc 
agcgtgcggg 
ttctcgctgc 
ggggacagcc
cagcagcgag
gcagcgtgcg 
gcttctcgct 
gttattaata 
ttacataact 
cgtcaataat 
gggtggacta 
gtacgccccc 
tgaccttatg 
tgggtcgagg 
ccaattttgt 
ggggcgcgcg 

tgcggcggca

gcggcggcgg
gccccgtgcc
actcccacag
ttaatgacgg
ccctttgtgc
ccgcgtgcgg
tgcgctccgc
gctgcgaggg
tgggcgcggc
ggcccggctt

ggggggtggc

tcgggggagg
gccattgcct 
ggcggagccg 
tgcggcgccg 
ccccttctcc 

ggggcagggc 

ggccgccagt 
agtactcgcg 
ccccggctat 
gaagggggag 
gatggcaaaa 

gggggttctg 

caccacttat 
agtggatgct 
gggaggcagc 
cgtggggatg 
agcctggcta
gctccctctg 
cctggagagg 
ccagcacatt 
cctggcactt 
agtctggtgg 
cggcctgtgg 
tcagacgggc



gggcacgggg 
cctgcagaca 
aagcccccag 
ggctccgctc 
ccgggcacgg 
agcctgcaga
caaagccccc 

ggggctccgc 

gcccgggcac 
tgagcctgca
cccaaagccc 
ccggggctcc 
cagcccgggc 
tttgagcctg 
cccccaaagc 
gcccggggct 
gacagcccgg 
tctttgagcc
cccccccaaa 
ccgcccgggg 
gggacagccc 
gctctttgag 
gtaatcaatt 
tacggtaaat 
gacgtatgtt 
tttacggtaa 
tattgacgtc
ggactttcct 
tgagccccac 
atttatttat 
ccaggcgggg
gccaatcaga
ccctataaaa
ccgctccgcg
gtgagcgggc
ctcgtttctt

gggggggagc 

cccgcgctgc 
gtgtgcgcga 
gaacaaaggc 
ggtcgggctg
cgggtgcggg
ggcaggtggg
ggcgcggcgg
tttatggtaa 
aaatctggga 
gcaggaagga 
atctccagcc 

ggggttcggc 

gtgatggata 
ggctgggcgc 
tggccaggag
gggggtgggg 

acctgacctg 
ctgtgccagt 
ctgccagagg
ggtagctggg 
acctgagtgc 
aaggaagctg 
tctgttctag 
ggcctcccag 
tacctcttgg 
ccacagaact 
ggtttggggt 
ccccaaacca 
gccagggcca 
tgtgctgaac 



aaggtggcac
cctgggggat 
ggatgtaatt 
cggtccggcg 
ggaaggtggc 
cacctggggg 
agggatgtaa 
tccggtccgg 
ggggaaggtg 
gacacctggg 
ccagggatgt 
gctccggtcc 
acggggaagg
cagacacctg
ccccagggat
ccgctccggt
gcacggggaa
tgcagacacc
gcccccaggg
ctccgctccg
gggcacgggg
cctgcagaca
acggggtcat
ggcccgcctg 
cccatagtaa 
actgcccact 
aatgacggta 
acttggcagt 
gttctgcttc
tttttaatta
cggggcgggg
gcggcgcgct
agcgaagcgc
ccgcctcgcg 
gggacggccc 
ttctgtggct 
ggctcggggg 
ccggcggctg 

ggggagcgcg 

tgcgtgcggg 
taaccccccc 
gctccgtgcg 
ggtgccgggc 
ccccggagcg 
tcgtgcgaga
ggcgccgccg
aatgggcggg
tcggggctgc
ttctggcgtg
tctgcagaat
tcccgcccgc 
gtggctgggt 
tgcctccacg 
tgaaggggac 
ggagaggaag 
ggaagcctct

ggtggggtgt 

ttgcatggtt 
tccttccaca 
aatgtcctgc 
tcctgggcgc 
aggccaagga
cacgctcagg 
ggagttggga 
tacctggaaa 
gagccttcag
actgcagctt



600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 



gaatgagaat atcactgtcc cagacaccaa 
ggtgagttcc tttttttttt tttttccttt 
tttggatgaa agggagaatg atcgagggaa 
cctgggcgca gaggctcacg tctataatcc 
cttgagccct ggagtttcag accaacctag 
acatttaaaa aaattagtca ggtgaagtgg 
gctgaggcgg gaggatcgct tgagcccagg 
ccactgcact ccagcctcag tgacagagtg 
gaaaaataat gagggctgta tggaatacat 
ttcattcatt cattcattca acaagtctta 
cttggggctg ctgaggggca ggagggagag 
cactccctgt aggtcgggca gcaggccgta 
gaagctgtcc tgcggggcca ggccctgttg 
cagctgcatg tggataaagc cgtcagtggc 
ctgggagccc aggtgagtag gagcggacac 
aagggtcttg ctaaggagta caggaactgt 
gcgacctcct gttttctcct tggcagaagg 
ctgctccact ccgaacaatc actgctgaca 
atttcctccg gggaaagctg aagctgtaca 
gacgtacaag taagaattca ctcctcaggt 
tgtggccaat gccctggctc acaaatacca 
tggggacatc atgaagcccc ttgagcatct 
cattgcaata gtgtgttgga attttttgtg 
aaatcattta aaacatcaga atgagtattt 
ctggctgcca tgaacaaagg tggctataaa 
ctgtccattc cttattccat agaaaagcct 
gttttgtgtt atttttttct ttaacatccc 
gatttttcct cctctcctga ctactcccag 
ccctcgacct gcagcccaag cttgcatgcc 
cgtatccccc aggtgtctgc aggctcaaag 
cccgtgccac cttccccgtg cccgggctgt 
gggagcgccg gaccggagcg gagccccggg 
gacgtaatta catccctggg ggctttgggg 
cccgtatccc ccaggtgtct gcaggctcaa 
atcccgtgcc accttccccg tgcccgggct 
gggggagcgc cggaccggag cggagccccg 
gggacgtaat tacatccctg ggggctttgg 
gccccgtatc ccccaggtgt ctgcaggctc 
cgatcccgtg ccaccttccc cgtgcccggg 
cggggggagc gccggaccgg agcggagccc 
gagggacgta attacatccc tgggggcttt 
cggccccgta tcccccaggt gtctgcaggc 
agcgatcccg tgccaccttc cccgtgcccg 
tgcgggggga gcgccggacc ggagcggagc 
gggagggacg taattacatc cctgggggct 
cgcggccccg tatcccccag gtgtctgcag 
aaagcgatcc cgtgccacct tccccgtgcc 
gatgcggggg gagcgccgga ccggagcgga 
gggggaggga cgtaattaca tccctggggg
tccgcggccc cgtatccccc aggtgtctgc 
ggaaagcgat cccgtgccac cttccccgtg 
gggatgcggg gggagcgccg gaccggagcg 
gcgggggagg gacgtaatta catccctggg 
gatccgcggg gctgcaggaa ttcgtaatca 
tatccgctca caattccaca caacatacga 
gcctaatgag tgagctaact cacattaatt 
ggaaacctgt cgtgccagct gcattaatga 
cgtattgggc gctcttccgc ttcctcgctc 
cggcgagcgg tatcagctca ctcaaaggcg 
aacgcaggaa agaacatgtg agcaaaaggc 
gcgttgctgg cgtttttcca taggctccgc 
tcaagtcaga ggtggcgaaa cccgacagga 
agctccctcg tgcgctctcc tgttccgacc 
ctcccttcgg gaagcgtggc gctttctcat 
taggtcgttc gctccaagct gggctgtgtg 
gccttatccg gtaactatcg tcttgagtcc 
gcagcagcca ctggtaacag gattagcaga 
agttaatttc tatgcctgga agaggatgga 4620
cttttggaga atctcatttg cgagcctgat 4680 
aggtaaaatg gagcagcaga gatgaggctg 4740 
caggctgaga tggccgagat gggagaattg 4 800 
gcagcatagt gagatccccc atctctacaa 4 8 60 
tgcatggtgg tagtcccaga tatttggaag 492 0 
aatttgaggc tgcagtgagc tgtgatcaca 4 980 
aggccctgtc tcaaaaaaga aaagaaaaaa 5040 
tcattattca ttcactcact cactcactca 5100 
ttgcatacct tctgtttgct cagcttggtg 5160 
ggtgacatgg gtcagctgac tcccagagtc 5220 
gaagtctggc agggcctggc cctgctgtcg 52 80 
gtcaactctt cccagccgtg ggagcccctg 53 4 0 
cttcgcagcc tcaccactct gcttcgggct 54 00 
ttctgcttgc cctttctgta agaaggggag 54 60 
ccgtattcct tccctttctg tggcactgca 5520 
aagccatctc ccctccagat gcggcctcag 55 80 
ctttccgcaa actcttccga gtctactcca 5640 
caggggaggc ctgcaggaca ggggacagat 5 7 00 
gcaggctgcc tatcagaagg tggtggctgg 5760 
ctgagatctt tttccctctg ccaaaaatta 5820 
gacttctggc taataaagga aatttatttt 58 80 
tctctcactc ggaaggacat atgggagggc 5 940 
ggtttagagt ttggcaacat atgccatatg 60 00 
gaggtcatca gtatatgaaa cagccccctg 6 0 60 
tgacttgagg ttagattttt tttatatttt 6120 
taaaattttc cttacatgtt ttactagcca 6180 
tcatagctgt ccctcttctc ttatgaagat 6240 
tgcaggtcga ctctagtgga tcccccgccc 63 00 
agcagcgaga agcgttcaga ggaaagcgat 63 60 
ccccgcacgc tgccggctcg gggatgcggg 6420 
cggctcgctg ctgcccccta gcgggggagg 64 80 
gggggctgtc cccgtgagcg gatccgcggc 6540 
agagcagcga gaagcgttca gaggaaagcg 6600 
gtccccgcac gctgccggct cggggatgcg 6660 
ggcggctcgc tgctgccccc tagcggggga 6720 
gggggggctg tccccgtgag cggatccgcg 6780 
aaagagcagc gagaagcgtt cagaggaaag 6840 
ctgtccccgc acgctgccgg ctcggggatg 6900 
cgggcggctc gctgctgccc cctagcgggg 6960 
gggggggggc tgtccccgtg agcggatccg 7020 
tcaaagagca gcgagaagcg ttcagaggaa 7080 
ggctgtcccc gcacgctgcc ggctcgggga 7140 
cccgggcggc tcgctgctgc cccctagcgg 7200 
ttgggggggg gctgtccccg tgagcggatc 7260 
gctcaaagag cagcgagaag cgttcagagg 7320 
cgggctgtcc ccgcacgctg ccggctcggg 7380 
gccccgggcg gctcgctgct gccccctagc 7440 
ctttgggggg gggctgtccc cgtgagcgga 7500 
aggctcaaag agcagcgaga agcgttcaga 7560 
cccgggctgt ccccgcacgc tgccggctcg 7620 
gagccccggg cggctcgctg ctgcccccta 7680 
ggctttgggg gggggctgtc cccgtgagcg 7740 
tggtcatagc tgtttcctgt gtgaaattgt 7800 
gccggaagca taaagtgtaa agcctggggt 7860 
gcgttgcgct cactgcccgc tttccagtcg 7920 
atcggccaac gcgcggggag aggcggtttg 7980 
actgactcgc tgcgctcggt cgttcggctg 8040 
gtaatacggt tatccacaga atcaggggat 8100 
cagcaaaagg ccaggaaccg taaaaaggcc 8160 
ccccctgacg agcatcacaa aaatcgacgc 8220 
ctataaagat accaggcgtt tccccctgga 8280 
ctgccgctta ccggatacct gtccgccttt 8340 
agctcacgct gtaggtatct cagttcggtg 8400 
cacgaacccc ccgttcagcc cgaccgctgc 8460 
aacccggtaa gacacgactt atcgccactg 8520 
gcgaggtatg taggcggtgc tacagagttc 8580 






ttgaagtggt 
ctgaagccag 
gctggtagcg 
caagaagatc 
taagggattt 
aaatgaagtt 
tgcttaatca 
tgactccccg 
gcaatgatac 
gccggaaggg 
aattgttgcc 
gccattgcta 
ggttcccaac 
tccttcggtc 
atggcagcac 
ggtgagtact 
ccggcgtcaa 
ggaaaacgtt 
atgtaaccca 
gggtgagcaa 
tgttgaatac 
ctcatgagcg 
acatttcccc 
tttattacca 
cggtgcgggc 
taagttgggt 
aatacgactc 
attgatgagt 
atttgtgatg 
ggcgaagaac 
aacgattccg 
gtgtcagtcc 



ggcctaacta 
ttaccttcgg 
gtggtttttt 
ctttgatctt 
tggtcatgag 
ttaaatcaat 
gtgaggcacc 
tcgtgtagat 
cgcgagaccc 
ccgagcgcag 
gggaagctag 
caggcatcgt 
gatcaaggcg 
ctccgatcgt 
tgcataattc 
caaccaagtc 
tacgggataa 
cttcggggcg 
ctcgtgcacc 
aaacaggaag 
tcatactctt 
gatacatatt 
gaaaagtgcc 
agcgaagcgc 
ctcttcgcta 
aacgccaggg 
acttaaggcc 
ttggacaaac 
ctattgcttt 
tccagcatga 
aagcccaacc 
tgctcctcgg 



cggctacact 
aaaaagagtt 
tgtttgcaag 
ttctacgggg 
attatcaaaa 
ctaaagtata 
tatctcagcg 
aactacgata 
acgctcaccg 
aagtggtcct 
agtaagtagt 
ggtgtcacgc 
agttacatga 
tgtcagaagt 
tcttactgtc 
attctgagaa 
taccgcgcca 
aaaactctca 
caactgatct 
gcaaaatgcc 
cctttttcaa 
tgaatgtatt 
acctgacgta 
cattcgccat 
ttacgccagc 
ttttcccagt 
ttgactagag 
cacaactaga 
atttgtaacc 
gatccccgcg 
tttcatagaa 
ccacgaagtg 



agaaggacag 
ggtagctctt 
cagcagatta 
tctgacgctc 
aggatcttca 
tatgagtaaa 
atctgtctat 
cgggagggct 
gctccagatt 
gcaactttat 
tcgccagtta 
tcgtcgtttg 
tcccccatgt 
aagttggccg 
atgccatccg 
tagtgtatgc 
catagcagaa 
aggatcttac 
tcagcatctt 
gcaaaaaagg 
tattattgaa 
tagaaaaata 
gttaacaaaa 
tcaggctgcg 
tggcgaaagg 
cacgacgttg 
ggtcgacggt 
atgcagtgaa 
attataagct 
ctggaggatc 
ggcggcggtg 
cacg 



tatttggtat 
gatccggcaa 
cgcgcagaaa 
agtggaacga 
cctagatcct 
cttggtctga 
ttcgttcatc 
taccatctgg 
tatcagcaat 
ccgcctccat 
atagtttgcg 
gtatggcttc 
tgtgcaaaaa 
cagtgttatc 
taagatgctt 
ggcgaccgag 
ctttaaaagt 
cgctgttgag 
ttactttcac 
gaataagggc 
gcatttatca 
aacaaatagg 
aaaagcccgc 
caactgttgg 
gggatgtgct 
taaaacgacg 
atacagacat 
aaaaatgctt 
gcaataaaca 
atccagccgg 
gaatcgaaat 



ctgcgctctg 
acaaaccacc 
aaaaggatct 
aaactcacgt 
tttaaattaa 
cagttaccaa 
catagttgcc 
ccccagtgct 
aaaccagcca 
ccagtctatt 
caacgttgtt 
attcagctcc 
agcggttagc 
actcatggtt 
ttctgtgact 
ttgctcttgc 
gctcatcatt 
atccagttcg 
cagcgtttct 
gacacggaaa 
gggttattgt 
ggttccgcgc 
cgaagcgggc 
gaagggcgat 
gcaaggcgat 
gccagtccgt 
gataagatac 
tatttgtgaa 
agttggggtg 
cgtcccggaa 
ctcgtagcac 



<210> 126 
<211> 6119 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> p18attBZeoeGFP Plasmid 



<400> 126 

cagttgccgg 

gtcatggccg 

tacagctcgt 

tcctggaccg 

tccacgaagt 

tcgcgcgcgg 

caagttagta 

gatcttcata 

ctggctagta 

caaaatataa 

gcagggggct 

gcatatggca 

tgccctccca 

gaaaataaat 

ataatttttg 

accagccacc 

ctcgtccatg 

gcgcttctcg 

cagcagcacg 

gctgccgtcc 

gtcggccatg 

gttgccgtcc 

ctcgaacttc 

ctcctggacg 

ggggtagcgg 



ccgggtcgcg 
gcccggaggc 
ccaggccgcg 
cgctgatgaa 
cccgggagaa 
tgagcaccgg 
taaaaaagca 
agagaagagg 
aaacatgtaa 
aaaaaatcta 
gtttcatata 
tatgttgcca 
tatgtccttc 
ttcctttatt 
gcagagggaa 
accttctgat 
ccgagagtga 
ttggggtctt 
gggccgtcgc 
tcgatgttgt 
atatagacgt 
tccttgaagt 
acctcggcgc 
tagccttcgg 
ctgaagcact 



cagggcgaac 
gtcccggaag 
cacccacacc 
cagggtcacg 
cccgagccgg 
aacggcactg 
ggcttcaatc 
gacagctatg 
ggaaaatttt 
acctcaagtc 
ctgatgacct 
aactctaaac 
cgagtgagag 
agccagaagt 
aaagatctca 
aggcagcctg 
tcccggcggc 
tgctcagggc 
cgatgggggt 
ggcggatctt 
tgtggctgtt 
cgatgccctt 
gggtcttgta 
gcatggcgga 
gcacgccgta 



tcccgccccc 
ttcgtggaca 
caggccaggg 
tcgtcccgga 
tcggtccaga 
gtcaacttgg 
ctgcagagaa 
actgggagta 
agggatgtta 
aaggcttttc 
ctttatagcc 
caaatactca 
acacaaaaaa 
cagatgctca 
gtggtatttg 
cacctgagga 
ggtcacgaac 
ggactgggtg 
gttctgctgg 
gaagttcacc 
gtagttgtac 
cagctcgatg 
gttgccgtcg 
cttgaagaag 
ggtcagggtg 



acggctgctc 
cgacctccga 
tgttgtccgg 
ccacaccggc 
actcgaccgc 
ccatggatcc 
gcttgggctg 
gtcaggagag 
aagaaaaaaa 
tatggaataa 
acctttgttc 
ttctgatgtt 
ttccaacaca 
aggggcttca 
tgagccaggg 
gtgaattctt 
tccagcagga 
ctcaggtagt 
tagtggtcgg 
ttgatgccgt 
tccagcttgt 
cggttcacca 
tccttgaaga 
tcgtgctgct 
gtcacgaggg 



gccgatctcg 
ccactcggcg 
caccacctgg 
gaagtcgtcc 
tccggcgacg 
agatttcgct 
caggtcgagg 
gaggaaaaat 
taacacaaaa 
ggaatggaca 
atggcagcca 
ttaaatgatt 
ctattgcaat 
tgatgtcccc 
cattggccac 
acttgtacag 
ccatgtgatc 
ggttgtcggg 
cgagctgcac 
tcttctgctt 
gccccaggat 
gggtgtcgcc 
agatggtgcg 
tcatgtggtc 
tgggccaggg 



8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10140 

10200 

10260 

10320 

10380 

10440 

10474 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 





cacgggcagc ttgccggtgg tgcagatgaa cttcagggtc agcttgccgt aggtggcatc 1560 
gccctcgccc tcgccggaca cgctgaactt gtggccgttt acgtcgccgt ccagctcgac 1620 
caggatgggc accaccccgg tgaacagctc ctcgcccttg ctcaccatgg tggcgaattc 1680 
tttgccaaaa tgatgagaca gcacaacaac cagcacgttg cccaggagct gtaggaaaaa 1740 
gaagaaggca tgaacatggt tagcagaggc tctagagccg ccggtcacac gccagaagcc 1800 
gaaccccgcc ctgccccgtc ccccccgaag gcagccgtcc ccctgcggca gccccgaggc 1860 
tggagatgga gaaggggacg gcggcgcggc gacgcacgaa ggccctcccc gcccatttcc 1920 
ttcctgccgg cgccgcaccg cttcgcccgc gcccgctaga gggggtgcgg cggcgcctcc 1980 
cagatttcgg ctccgccaga tttgggacaa aggaagtccc tgcgccctct cgcacgatta 2040 
ccataaaagg caatggctgc ggctcgccgc gcctcgacag ccgccggcgc tccggggccg 2100 
ccgcgcccct cccccgagcc ctccccggcc cgaggcggcc ccgccccgcc cggcaccccc 2160 
acctgccgcc accccccgcc cggcacggcg agccccgcgc cacgccccgc acggagcccc 2220 
gcacccgaag ccgggccgtg ctcagcaact cggggagggg ggtgcagggg ggggttacag 2280 
cccgaccgcc gcgcccacac cccctgctca cccccccacg cacacacccc gcacgcagcc 2340 
tttgttcccc tcgcagcccc cccgcaccgc ggggcaccgc ccccggccgc gctcccctcg 2400 
cgcacacgcg gagcgcacaa agccccgcgc cgcgcccgca gcgctcacag ccgccgggca 2460 
gcgcgggccg cacgcggcgc tccccacgca cacacacacg cacgcacccc ccgagccgct 2520 
cccccccgca caaagggccc tcccggagcc ctttaaggct ttcacgcagc cacagaaaag 2580 
aaacgagccg tcattaaacc aagcgctaat tacagcccgg aggagaaggg ccgtcccgcc 2640 
cgctcacctg tgggagtaac gcggtcagtc agagccgggg cgggcggcgc gaggcggcgc 2700 
ggagcggggc acggggcgaa ggcaacgcag cgactcccgc ccgccgcgcg cttcgctttt 2760 
tatagggccg ccgccgccgc cgcctcgcca taaaaggaaa ctttcggagc gcgccgctct 2820 
gattggctgc cgccgcacct ctccgcctcg ccccgccccg cccctcgccc cgccccgccc 2880 
cgcctggcgc gcgccccccc cccccccgcc cccatcgctg cacaaaataa ttaaaaaata 2940 
aataaataca aaattggggg tggggagggg ggggagatgg ggagagtgaa gcagaacgtg 3000 
gggctcacct cgacccatgg taatagcgat gactaatacg tagatgtact gccaagtagg 3060 
aaagtcccat aaggtcatgt actgggcata atgccaggcg ggccatttac cgtcattgac 3120 
gtcaataggg ggcgtacttg gcatatgata cacttgatgt actgccaagt gggcagttta 3180 
ccgtaaatag tccacccatt gacgtcaatg gaaagtccct attggcgtta ctatgggaac 3240 
atacgtcatt attgacgtca atgggcgggg gtcgttgggc ggtcagccag gcgggccatt 3300 
taccgtaagt tatgtaacgc ggaactccat atatgggcta tgaactaatg accccgtaat 3360 
tgattactat taataactag aggatccccg ggtaccgagc tcgaattcgt aatcatggtc 3420 
atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 3480 
aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 3540 
gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg 3600 
ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 3660 
ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 3720 
acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 3780 
aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 3840 
tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 3900 
aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 3960 
gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc 4020 
acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 4080 
accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 4140 
ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 4200 
gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 4260 
gacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 4320 
ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 4380 
gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 4440 
cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 4500 
cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 4560 
gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 4620 
tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 4680 
gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 4740 
agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 4800 
tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 4860 
agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 4920 
gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc 4980 
catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 5040 
ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc 5100 
atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg 5160 
tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 5220 
cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 5280 
cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 5340 
atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 5400 
aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 5460 
ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 5520 






aaataaacaa 
caaaaaaaag 
ctgcgcaact 
aaagggggat 
cgttgtaaaa 
acggtataca 
gtgaaaaaaa 
aagctgcaat 
ggatcatcca 
cggtggaatc 



ataggggttc 
cccgccgaag 
gttgggaagg 
gtgctgcaag 
cgacggccag 
gacatgataa 
tgctttattt 
aaacaagttg 
gccggcgtcc 
gaaatctcgt 



cgcgcacatt 
cgggctttat 
gcgatcggtg 
gcgattaagt 
tccgtaatac 
gatacattga 
gtgaaatttg 
gggtgggcga 
cggaaaacga 
agcacgtgtc 



tccccgaaaa 
taccaagcga 
cgggcctctt 
tgggtaacgc 
gactcactta 
tgagtttgga 
tgatgctatt 
agaactccag 
ttccgaagcc 
agtcctgctc 



gtgccacctg 
agcgccattc 
cgctattacg 
cagggttttc 
aggccttgac 
caaaccacaa 
gctttatttg 
catgagatcc 
caacctttca 
ctcggccacg 



acgtagttaa 
gccattcagg 
ccagctggcg 
ccagtcacga 
tagagggtcg 
ctagaatgca 
taaccattat 
ccgcgctgga 
tagaaggcgg 
aagtgcacg 



5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6119 



<210> 127 
<211> 5855 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCXLamInt Plasmid (Wildtype Integrase) 



<400> 127 

gtcgacattg 

gcccatatat 

ccaacgaccc 

ggactttcca 

atcaagtgta 

cctggcatta 

tattagtcat 

atctcccccc 

gcgatggggg 

gggcggggcg 

tccttttatg 

gggagtcgct 

ccggctctga 

gggctgtaat 

ccttaaaggg 

tgtgtgtgtg 

cgggcgcggc 

ggtgccccgc 

tgggggggtg 

cctccccgag 
gcggggctcg 
ccgcctcggg 
gtcgaggcgc 
gacttccttt 
tagcgggcgc 
cgtgcgtcgc 
acggctgcct 
gctctagagc 
acgtgctggt 
gtcatgagcg 
acagggaccc 
ctgaagctat 
cgagaatcaa 
tcctggccag 
caataaggag 
caatgctcaa 
cactgagcga 
ctgccactcg 
tgaaaattta 
ctgttgttac 
atggatatct 
tgcatattga 
ttggcggaga 
caaggtattt 
cctttcacga 
ttgctcaaca 
gaggcaggga 



attattgact 
ggagttccgc 
ccgcccattg 
ttgacgtcaa 
tcatatgcca 
tgcccagtac 
cgctattacc 
cctccccacc 

cggggggggg 
aggcggagag 
gcgaggcggc 
gcgttgcctt 
ctgaccgcgt 
tagcgcttgg 
ctccgggagg 
cgtggggagc 
gcggggcttt 
ggtgcggggg 
agcagggggt 
ttgctgagca 
ccgtgccggg 
ccggggaggg 
ggcgagccgc 
gtcccaaatc 
gggcgaagcg 
cgcgccgccg 
tcggggggga 
ctctgctaac 
tgttgtgctg 
ccgggattta 
aaggacgggt 
acaggccaac 
cagtgataat 
cagaggaatc 
gggtctgcct 
tggatacata 
tgcattccga 
cgcagcaaaa 
tcaagcagca 
cgggcaacga 
ttatgtcgag 
tgctctcgga 
aaccataatt 
tatgcgcgca 
gttgcgcagt 
tcttctcggg 
gtgggacaaa 



agttattaat 
gttacataac 
acgtcaataa 
tgggtggact 
agtacgcccc 
atgaccttat 
atgggtcgag 
cccaattttg 
gggggcgcgc 
gtgcggcggc 

ggcggcggcg 
cgccccgtgc 
tactcccaca 
tttaatgacg 
gccctttgtg 
gccgcgtgcg 
gtgcgctccg 
ggctgcgagg 
gtgggcgcgg 
cggcccggct 
cggggggtgg 
ctcgggggag 
agccattgcc 
tggcggagcc 
gtgcggcgcc 
tccccttctc 
cggggcaggg 
catgttcatg 
tctcatcatt 
ccccctaacc 
aaagagtttg 
attgagttat 
tccgttacgt 
aagcagaaga 
gatgctccac 
gacgagggca 
gaggcaatag 
tcagaggtaa 
gaatcatcac 
gttggtgatt 
caaagcaaaa 
atatcaatga 
gcatctactc 
cgaaaagcat 
ttgtctgcaa 
cataagtcgg 
attgaaatca 



agtaatcaat 

ttacggtaaa 

tgacgtatgt 

atttacggta 

ctattgacgt 

gggactttcc 

gtgagcccca 

tatttattta 

gccaggcggg 

agccaatcag 

gccctataaa 

cccgctccgc 

ggtgagcggg 

gctcgtttct 

cgggggggag 

gcccgcgctg 

cgtgtgcgcg 

ggaacaaagg 

cggtcgggct 

tcgggtgcgg 

cggcaggtgg 

gggcgcggcg 

ttttatggta 

gaaatctggg 

ggcaggaagg 

catctccagc 

cggggttcgg 

ccttcttctt 

ttggcaaaga 

tttatataag 

gattaggcag 

tttcaggaca 

tacattcatg 

cactcataaa 

ttgaagacat 

aggcggcgtc 

ctgaaggcca 

ggagatcaag 

catgttggct 

tatgcgaaat 

caggcgtaaa 

aggaaacact 

gtcgcgaacc 

caggtctttc 

gactctatga 

acaccatggc 

aataagaatt 



tacggggtca 
tggcccgcct 
tcccatagta 
aactgcccac 
caatgacggt 
tacttggcag 
cgttctgctt 
ttttttaatt 
gcggggcggg 
agcggcgcgc 
aagcgaagcg 
gccgcctcgc 
cgggacggcc 
tttctgtggc 
cggctcgggg 
cccggcggct 
aggggagcgc 
ctgcgtgcgg 
gtaacccccc 
ggctccgtgc 
gggtgccggg 
gccccggagc 
atcgtgcgag 
aggcgccgcc 
aaatgggcgg 
ctcggggctg 
cttctggcgt 
tttcctacag 
attcatggga 
aaacaatgga 
agacaggcga 
caaacacaag 
gcttgatcgc 
ttacatgagc 
caccacaaaa 
agccaagtta 
tataacaaca 
acttacggct 
cagacttgca 
gaagtggtct 
aattgccatc 
tgataaatgc 
gctttcatcc 
cttcgaaggg 
gaagcagata 
atcacagtat 
cactcctcag 



ttagttcata 
ggctgaccgc 
acgccaatag 
ttggcagtac 
aaatggcccg 
tacatctacg 
cactctcccc 
attttgtgca 
gcgaggggcg 
tccgaaagtt 
cgcggcgggc 
gccgcccgcc 
cttctcctcc 
tgcgtgaaag 
ggtgcgtgcg 
gtgagcgctg 
ggccgggggc 
ggtgtgtgcg 
cctgcacccc 

ggggcgtggc 
cggggcgggg 
gccggcggct 
agggcgcagg 
gcaccccctc 
ggagggcctt 
ccgcaggggg 
gtgaccggcg 
ctcctgggca 
agaaggcgaa 
tattactgct 
atcgcaatca 
cctctgacag 
tacgaaaaaa 
aaaattaaag 
gaaattgcgg 
atcagatcaa 
aaccatgtcg 
gacgaatacc 
atggaactgg 
gatatcgtag 
ccaacagcat 
aaagagattc 
ggcacagtat 
gatccgccta 
agcgataagt 
cgtgatgaca 
gtgcaggctg 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 
cctatcagaa ggtggtggct ggtgtggcca 
tttttccctc tgccaaaaat tatggggaca 
gctaataaag gaaatttatt ttcattgcaa 
tcggaaggac atatgggagg gcaaatcatt 
gtttggcaac atatgccata tgctggctgc 
cagtatatga aacagccccc tgctgtccat 
ggttagattt tttttatatt ttgttttgtg 
tccttacatg ttttactagc cagatttttc 
gtccctcttc tcttatgaag atccctcgac 
atagctgttt cctgtgtgaa attgttatcc 
aagcataaag tgtaaagcct ggggtgccta 
gcgctcactg cccgctttcc agtcgggaaa 
tagtcagcaa ccatagtccc gcccctaact 
tccgcccatt ctccgcccca tggctgacta 
gcctcggcct ctgagctatt ccagaagtag 
tgcaaaaagc taacttgttt attgcagctt 
caaatttcac aaataaagca tttttttcac 
tcaatgtatc ttatcatgtc tggatccgct 
aggcggtttg cgtattgggc gctcttccgc 
cgttcggctg cggcgagcgg tatcagctca 
atcaggggat aacgcaggaa agaacatgtg 
taaaaaggcc gcgttgctgg cgtttttcca 
aaatcgacgc tcaagtcaga ggtggcgaaa 
tccccctgga agctccctcg tgcgctctcc 
gtccgccttt ctcccttcgg gaagcgtggc 
cagttcggtg taggtcgttc gctccaagct 
cgaccgctgc gccttatccg gtaactatcg 
atcgccactg gcagcagcca ctggtaacag 
tacagagttc ttgaagtggt ggcctaacta 
ctgcgctctg ctgaagccag ttaccttcgg 
acaaaccacc gctggtagcg gtggtttttt 
aaaaggatct caagaagatc ctttgatctt 
aaactcacgt taagggattt tggtcatgag 
tttaaattaa aaatgaagtt ttaaatcaat 
cagttaccaa tgcttaatca gtgaggcacc 
catagttgcc tgactccccg tcgtgtagat 
ccccagtgct gcaatgatac cgcgagaccc 
aaaccagcca gccggaaggg ccgagcgcag 
ccagtctatt aattgttgcc gggaagctag 
caacgttgtt gccattgcta caggcatcgt 
attcagctcc ggttcccaac gatcaaggcg 
agcggttagc tccttcggtc ctccgatcgt 
actcatggtt atggcagcac tgcataattc 
ttctgtgact ggtgagtact caaccaagtc 
ttgctcttgc ccggcgtcaa tacgggataa 
gctcatcatt ggaaaacgtt cttcggggcg 
atccagttcg atgtaaccca ctcgtgcacc 
cagcgtttct gggtgagcaa aaacaggaag 
gacacggaaa tgttgaatac tcatactctt 
gggttattgt ctcatgagcg gatacatatt 
ggttccgcgc acatttcccc gaaaagtgcc 

<210> 128 
<211> 303 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Human FER-1 Promoter 



atgccctggc tcacaaatac cactgagatc 2880 
tcatgaagcc ccttgagcat ctgacttctg 2940 
tagtgtgttg gaattttttg tgtctctcac 3000 
taaaacatca gaatgagtat ttggtttaga 3060 
catgaacaaa ggtggctata aagaggtcat 3120 
tccttattcc atagaaaagc cttgacttga 3180 
ttattttttt ctttaacatc cctaaaattt 3240 
ctcctctcct gactactccc agtcatagct 3300 
ctgcagccca agcttggcgt aatcatggtc 3360 
gctcacaatt ccacacaaca tacgagccgg 3420 
atgagtgagc taactcacat taattgcgtt 3480 
cctgtcgtgc cagcggatcc gcatctcaat 3540 
ccgcccatcc cgcccctaac tccgcccagt 3600 
atttttttta tttatgcaga ggccgaggcc 3660 
tgaggaggct tttttggagg cctaggcttt 3720 
ataatggtta caaataaagc aatagcatca 3780 
tgcattctag ttgtggtttg tccaaactca 3840 
gcattaatga atcggccaac gcgcggggag 3900 
ttcctcgctc actgactcgc tgcgctcggt 3960 
ctcaaaggcg gtaatacggt tatccacaga 4020 
agcaaaaggc cagcaaaagg ccaggaaccg 4080 
taggctccgc ccccctgacg agcatcacaa 4140 
cccgacagga ctataaagat accaggcgtt 4200 
tgttccgacc ctgccgctta ccggatacct 4260 
gctttctcaa tgctcacgct gtaggtatct 4320 
gggctgtgtg cacgaacccc ccgttcagcc 4380 
tcttgagtcc aacccggtaa gacacgactt 4440 
gattagcaga gcgaggtatg taggcggtgc 4500 
cggctacact agaaggacag tatttggtat 4560 
aaaaagagtt ggtagctctt gatccggcaa 4620 
tgtttgcaag cagcagatta cgcgcagaaa 4680 
ttctacgggg tctgacgctc agtggaacga 4740 
attatcaaaa aggatcttca cctagatcct 4800 
ctaaagtata tatgagtaaa cttggtctga 4860 
tatctcagcg atctgtctat ttcgttcatc 4920 
aactacgata cgggagggct taccatctgg 4980 
acgctcaccg gctccagatt tatcagcaat 5040 
aagtggtcct gcaactttat ccgcctccat 5100 
agtaagtagt tcgccagtta atagtttgcg 5160 
ggtgtcacgc tcgtcgtttg gtatggcttc 5220 
agttacatga tcccccatgt tgtgcaaaaa 5280 
tgtcagaagt aagttggccg cagtgttatc 5340 
tcttactgtc atgccatccg taagatgctt 5400 
attctgagaa tagtgtatgc ggcgaccgag 5460 
taccgcgcca catagcagaa ctttaaaagt 5520 
aaaactctca aggatcttac cgctgttgag 5580 
caactgatct tcagcatctt ttactttcac 5640 
gcaaaatgcc gcaaaaaagg gaataagggc 5700 
cctttttcaa tattattgaa gcatttatca 5760 
tgaatgtatt tagaaaaata aacaaatagg 5820 
acctg 5855 



<400> 128 

tccatgacaa agcacttttt gagcccaagc 
gacgccaccg ctgtcccaga ggcagtcggc 
gcgcgcgagg gcctccagcg gccgcccctc 
ccggaaggag cgggctcggg gcgggcggcg 
gcggctataa gagaccacaa gcgacccgca 
acc 



ccagcctagc tcgagctaaa cgggcacaga 60 
taccggtccc cgctcccgag ctccgccaga 120 
ccccacagca ggggcggggt cccgcgccca 180 
ctgattggcc ggggcgggcc tgacgccgac 240 
gggccagacg ttcttcgccg agagtcgggt 300 

303 
<210> 129 
<211> 6521 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pIRES-BSR Plasmid 



<400> 129 

tcaatattgg 

ttggccattg 

aatatgaccg 

gtcattagtt 

gcctggctga 

agtaacgcca 

ccacttggca 

cggtaaatgg 

gcagtacatc 

caatgggcgt 

caatgggagt 

cgatcgcccg 

agcagagctc 

agttaaattg 

gactctctta 

ggttacaaga 

cttgcgtttc 

aggtgtccac 

ataggctagc 

tctccctccc 

tttgtctata 

cctggccctg 

caaggtctgt 

acgtctgtag 

ggccaaaagc 

gtgagttgga 

ctgaaggatg 

tgctttacat 

gtggttttcc 

acatttctca 

atgaggataa 

cggcagtaca 

ttggtagtgc 

cttattctga 

agttgatttc 

tcaaaactac 

ataccaagct 

gataagatac 

tatttgtgaa 

agttaacaac 

tttttaaagc 

tggcgtaata 

ggcgaatgga 

gcgtgaccgc 

ttctcgccac 

tccgatttag 

gtagtgggcc 

ttaatagtgg 

ttgatttata 

aaatatttaa 

tttctcctta 

ggcctgaaat 

agctgtggaa 

gtatgcaaag 

cagcaggcag 

taactccgcc 

gactaatttt 

agtagtgagg 



ccattagcca 
catacgttgt 
ccatgttggc 
catagcccat 
ccgcccaacg 
atagggactt 
gtacatcaag 
cccgcctggc 
tacgtattag 
ggatagcggt 
ttgttttggc 
ccccgttgac 
gtttagtgaa 
ctaacgcagt 
aggtagcctt 
caggtttaag 
tgataggcac 
tcccagttca 
ctcgagaatt 
ccccccctaa 
tgtgattttc 
tcttcttgac 
tgaatgtcgt 
cgaccctttg 
cacgtgtata 
tagttgtgga 
cccagaaggt 
gtgtttagtc 
tttgaaaaac 
acaagatcta 
taaacatcat 
tattgaagcg 
agtttcgaat 
cgaagtagat 
agactatgca 
gattgaagaa 
tggcgggcgg 
attgatgagt 
atttgtgatg 
aacaattgca 
aagtaaaacc 
gcgaagaggc 
cgcgccctgt 
tacacttgcc 
gttcgccggc 
agctttacgg 
atcgccctga 
actcttgttc 

agggattttg 

cgcgaatttt 
cgcatctgtg 
aacctctgaa 
tgtgtgtcag 
catgcatctc 
aagtatgcaa 
catcccgccc 
ttttatttat 
aggctttttt 



tattattcat 
atctatatca 
attgattatt 
atatggagtt 
acccccgccc 
tccattgacg 
tgtatcatat 
attatgccca 
tcatcgctat 
ttgactcacg 
accaaaatca 
gcaaatgggc 
ccgtcagatc 
cagtgcttct 
gcagaagttg 
gagaccaata 
ctattggtct 
attacagctc 
cacgcgtcga 
cgttactggc 
caccatattg 
gagcattcct 
gaaggaagca 
caggcagcgg 
agatacacct 
aagagtcaaa 
accccattgt 
gaggttaaaa 
acgatgataa 
gaattagtag 
gtgggagcgg 
tatataggac 
ggacaaaagg 
agaagtattc 
ccagattgtt 
ctcattccac 
ccgcttccct 
ttggacaaac 
ctattgcttt 
ttcattttat 
tctacaaatg 
ccgcaccgat 
agcggcgcat 
agcgccctag 
tttccccgtc 
cacctcgacc 
tagacggttt 
caaactggaa 
ccgatttcgg 
aacaaaatat 
cggtatttca 
agaggaactt 
ttagggtgtg 
aattagtcag 
agcatgcatc 
ctaactccgc 
gcagaggccg 
ggaggcctag 



tggttatata 
taatatgtac 
gactagttat 
ccgcgttaca 
attgacgtca 
tcaatgggtg 
gccaagtccg 
gtacatgacc 
taccatggtg 
gggatttcca 
acgggacttt 
ggtaggcgtg 
actagaagct 
gacacaacag 
gtcgtgaggc 
gaaactgggc 
tactgacatc 
ttaaggctag 
gcatgcatct 
cgaagccgct 
ccgtcttttg 
aggggtcttt 
gttcctctgg 
aaccccccac 
gcaaaggcgg 
tggctctcct 
atgggatctg 
aaacgtctag 
gcttgccaca 
aagtagcgac 
caattcgtac 
gagtaactgt 
attttgacac 
gagtggtaag 
ttgtgttaat 
tcaaatatac 
ttagtgaggg 
cacaactaga 
atttgtaacc 
gtttcaggtt 
tggtaaaatc 
cgcccttccc 
taagcgcggc 
cgcccgctcc 
aagctctaaa 
gcaaaaaact 
ttcgcccttt 
caacactcaa 
cctattggtt 
taacgtttac 
caccgcatac 
ggttaggtac 
gaaagtcccc 
caaccaggtg 
tcaattagtc 
ccagttccgc 
aggccgcctc 
gcttttgcaa 



gcataaatca 
atttatattg 
taatagtaat 
taacttacgg 
ataatgacgt 
gagtatttac 
ccccctattg 
ttacgggact 
atgcggtttt 
agtctccacc 
ccaaaatgtc 
tacggtggga 
ttattgcggt 
tctcgaactt 
actgggcagg 
ttgtcgagac 
cactttgcct 
agtacttaat 
agggcggcca 
tggaataagg 
gcaatgtgag 
cccctctcgc 
aagcttcttg 
ctggcgacag 
cacaacccca 
caagcgtatt 
atctggggcc 
gccccccgaa 
acccaccatg 
agagaagatt 
gaaaacagga 
ttgtgcagaa 
gattgtagct 
tccttgtggt 
agaaatgaat 
ccgaaattaa 
ttaatgcttc 
atgcagtgaa 
attataagct 
cagggggaga 
cgataaggat 
aacagttgcg 
gggtgtggtg 
tttcgctttc 
tcgggggctc 
tgatttgggt 
gacgttggag 
ccctatctcg 
aaaaaatgag 
aatttcgcct 
gcggatctgc 
cttctgaggc 
aggctcccca 
tggaaagtcc 
agcaaccata 
ccattctccg 
ggcctctgag 
aaagcttgat 



atattggcta 
gctcatgtcc 
caattacggg 
taaatggccc 
atgttcccat 
ggtaaactgc 
acgtcaatga 
ttcctacttg 
ggcagtacac 
ccattgacgt 
gtaacaactg 
ggtctatata 
agtttatcac 
aagctgcagt 
taagtatcaa 
agagaagact 
ttctctccac 
acgactcact 
attccgcccc 
ccggtgtgcg 
ggcccggaaa 
caaaggaatg 
aagacaaaca 
gtgcctctgc 
gtgccacgtt 
caacaagggg 
tcggtgcaca 
ccacggggac 
aaaacattta 
acaatgcttt 
gaaatcattt 
gccattgcga 
gttagacacc 
atgtgtaggg 
ggcaagttag 
aagttttacc 
gagcagacat 
aaaaatgctt 
gcaataaaca 
tgtgggaggt 
cgatccgggc 
cagcctgaat 
gttacgcgca 
ttcccttcct 
cctttagggt 
gatggttcac 
tccacgttct 
gtctattctt 
ctgatttaac 
gatgcggtat 
gcagcaccat 
ggaaagaacc 
gcaggcagaa 
ccaggctccc 
gtcccgcccc 
ccccatggct 
ctattccaga 
tcttctgaca 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 
caacagtctc 
ttctccggcc 
ctgctctgat 
gaccgacctg 
ggccacgacg 
ctggctgcta 
cgagaaagta 
ctgcccattc 
cggtcttgtc 
gttcgccagg 
tgcctgcttg 
ccggctgggt 
agagcttggc 
ttcgcagcgc 
ttcgaaatga 
tttattttca 
cgtatggtgc 
cccgccaaca 
acaagctgtg 
acgcgcgaga 
aatggtttct 
tttatttttc 
gcttcaataa 
tccctttttt 
aaaagatgct 
cggtaagatc 
agttctgcta 
ccgcatacac 
tacggatggc 
tgcggccaac 
caacatgggg 
accaaacgac 
attaactggc 
ggataaagtt 
taaatctgga 
taagccctcc 
aaatagacag 
agtttactca 
ggtgaagatc 
ctgagcgtca 
cgtaatctgc 
tcaagagcta 
tactgtcctt 
tacatacctc 
tcttaccggg 

ggggggttcg 

acagcgtgag 
ggtaagcggc 
gtatctttat 
ctcgtcaggg 
ggccttttgc 



gaacttaagg 
gcttgggtgg 
gccgccgtgt 
tccggtgccc 
ggcgttcctt 
ttgggcgaag 
tccatcatgg 
gaccaccaag 
gatcaggatg 
ctcaaggcgc 
ccgaatatca 
gtggcggacc 
ggcgaatggg 
atcgccttct 
ccgaccaagc 
ttacatctgt 
actctcagta 
cccgctgacg 
accgtctccg 
cgaaagggcc 
tagacgtcag 
taaatacatt 
tattgaaaaa 
gcggcatttt 
gaagatcagt 
cttgagagtt 
tgtggcgcgg 
tattctcaga 
atgacagtaa 
ttacttctga 
gatcatgtaa 
gagcgtgaca 
gaactactta 
gcaggaccac 
gccggtgagc 
cgtatcgtag 
atcgctgaga 
tatatacttt 
ctttttgata 
gaccccgtag 
tgcttgcaaa 
ccaactcttt 
ctagtgtagc 
gctctgctaa 
ttggactcaa 
tgcacacagc 
ctatgagaaa 
agggtcggaa 
agtcctgtcg 
gggcggagcc 
tggccttttg 



ctagagccac 
agaggctatt 
tccggctgtc 
tgaatgaact 
gcgcagctgt 
tgccggggca 
ctgatgcaat 
cgaaacatcg 
atctggacga 
gcatgcccga 
tggtggaaaa 
gctatcagga 
ctgaccgctt 
atcgccttct 
gacgcccaac 
gtgttggttt 
caatctgctc 
cgccctgacg 
ggagctgcat 
tcgtgatacg 
gtggcacttt 
caaatatgta 
ggaagagtat 
gccttcctgt 
tgggtgcacg 
ttcgccccga 
tattatcccg 
atgacttggt 
gagaattatg 
caacgatcgg 
ctcgccttga 
ccacgatgcc 
ctctagcttc 
ttctgcgctc 

gtgggtctcg 

ttatctacac 
taggtgcctc 
agattgattt 
atctcatgac 
aaaagatcaa 
caaaaaaacc 
ttccgaaggt 
cgtagttagg 
tcctgttacc 
gacgatagtt 
ccagcttgga 
gcgccacgct 
caggagagcg 
ggtttcgcca 
tatggaaaaa 
ctcacatggc 



catgattgaa 
cggctatgac 
agcgcagggg 
gcaggacgag 
gctcgacgtt 
ggatctcctg 
gcggcggctg 
catcgagcga 
agagcatcag 
cggcgaggat 
tggccgcttt 
catagcgttg 
cctcgtgctt 
tgacgagttc 
ctgccatcac 
tttgtgtgaa 
tgatgccgca 
ggcttgtctg 
gtgtcagagg 
cctattttta 
tcggggaaat 
tccgctcatg 
gagtattcaa 
ttttgctcac 
agtgggttac 
agaacgtttt 
tattgacgcc 
tgagtactca 
cagtgctgcc 
aggaccgaag 
tcgttgggaa 
tgtagcaatg 
ccggcaacaa 
ggcccttccg 
cggtatcatt 
gacggggagt 
actgattaag 
aaaacttcat 
caaaatccct 
aggatcttct 
accgctacca 
aactggcttc 
ccaccacttc 
agtggctgct 
accggataag 
gcgaacgacc 
tcccgaaggg 
cacgagggag 
cctctgactt 
cgccagcaac 
tcgacagatc 



caagatggat 
tgggcacaac 
cgcccggttc 
gcagcgcggc 
gtcactgaag 
tcatctcacc 
catacgcttg 
gcacgtactc 
gggctcgcgc 
ctcgtcgtga 
tctggattca 
gctacccgtg 
tacggtatcg 
ttctgagcgg 
gatggccgca 
tcgatagcga 
tagttaagcc 
ctcccggcat 
ttttcaccgt 
taggttaatg 
gtgcgcggaa 
agacaataac 
catttccgtg 
ccagaaacgc 
atcgaactgg 
ccaatgatga 
gggcaagagc 
ccagtcacag 
ataaccatga 
gagctaaccg 
ccggagctga 
gcaacaacgt 
ttaatagact 
gctggctggt 
gcagcactgg 
caggcaacta 
cattggtaac 
ttttaattta 
taacgtgagt 
tgagatcctt 
gcggtggttt 
agcagagcgc 
aagaactctg 
gccagtggcg 
gcgcagcggt 
tacaccgaac 
agaaaggcgg 
cttccagggg 
gagcgtcgat 
gcggcctttt 
t 



tgcacgcagg 
agacaatcgg 
tttttgtcaa 
tatcgtggct 
cgggaaggga 
ttgctcctgc 
atccggctac 
ggatggaagc 
cagccgaact 
cccatggcga 
tcgactgtgg 
atattgctga 
ccgctcccga 
gactctgggg 
ataaaatatc 
taaggatccg 
agccccgaca 
ccgcttacag 
catcaccgaa 
tcatgataat 
cccctatttg 
cctgataaat 
tcgcccttat 
tggtgaaagt 
atctcaacag 
gcacttttaa 
aactcggtcg 
aaaagcatct 
gtgataacac 
cttttttgca 
atgaagccat 
tgcgcaaact 
ggatggaggc 
ttattgctga 
ggccagatgg 
tggatgaacg 
tgtcagacca 
aaaggatcta 
tttcgttcca 
tttttctgcg 
gtttgccgga 
agataccaaa 
tagcaccgcc 
ataagtcgtg 
cgggctgaac 

tgagatacct 
acaggtatcc 
Group I, claim(s) 1-64, 67-71, 79, 84-86, 91-108, and 123, drawn to eukaryotic 
recombinogenic chromosomes used for introducing heterologous nucleic acids into a 
chromosome and the resulting cells. 

Group II, claim(s) 65-66 and 87-88, drawn to a lambda intR mutein. 

Group III, claim(s) 72-78, drawn to the production of transgenic animals. 

Group IV, claim(s) 80, drawn to the production of an artificial chromosome library. 

Group V, claim(s) 81-83, drawn to a library of cells for genomic screening. 

Group VI, claim(s) 89-90, drawn to a modified iron-induced promoter. 

Group VII, claim(s) 109-122, drawn to a method for screening compounds and their effects on 
regulatory regions. 
of unity of invention (Rules 13.1, 13.2 and 13.3) for the reasons indicated below: 

The inventions listed as Groups I-VII do not relate to a single inventive concept under PCT 
Rule 13.1 because, under PCT Rule 13.2, they lack the same or corresponding special 
technical features for the following reasons: 

The special technical feature of Group I which defines an advance over the art is a 
eukaryotic chromosome containing recombinogenic sites that can be used to introduce 
heterologous nucleic acids into chromosomes, and cells containing these recombinogenic 
chromosomes. 

The special technical feature of Group II involves a lambda intR mutein. This feature 
defines an advance over Group I in that it involves a protein that is not required for the 
technical features as set forth above in Group I. 

The special technical feature of Group III involves the production of a transgenic animal, 
which represents a second method for using the invention as set forth above in Group I. 

The special technical feature of Group IV involves the production of artificial chromosome 
expression system libraries, which represents a third method for using the invention as set 
forth above in Group I. 

The special technical feature of Group V involves a library of cells containing the 
artificial chromosome expression system libraries set forth above in Group IV, and 
represents a first product resulting from Group IV. 

The special technical feature of Group VI involves a modified iron-inducible promoter. 
This feature defines an advance over Group I in that it involves a promoter that is not 
required for the technical features as set forth above in Group I. 

The special technical feature of Group VII involves a method for screening compounds for 
their effect on regulatory regions, which represents a fourth method of using the invention 
as set forth above in Group I. 